Abstract

Using cosmological $N$-body simulations and the void probability function (VPF), we investigate
the statistical properties of voids within a wide range of initially Gaussian models for the origin
of large-scale structure. We study the dependence of the VPF on cosmological parameters, on the
power spectrum of primordial fluctuations, and on assumptions about galaxy formation. We pay
particular attention to the ability of the VPF to diagnose 'biased galaxy formation': the preferential
formation of galaxies in regions of high background density and corresponding suppression of galaxy
formation in regions of low background density. We find that the VPF is insensitive to the cosmic
density parameter $\Omega_0$ and the cosmological constant $\lambda_0$, provided that fluctuations are normalized to
a fixed rms amplitude on scales ~ 8 $h^{-1}$ Mpc. In the absence of biasing, the VPF is also insensitive
to the shape of the initial power spectrum. The VPF does depend on the prescription adopted
for biased galaxy formation, in the obvious sense that a scheme that more efficiently suppresses
galaxy formation in low density regions leads to larger voids. Biased models have systematically
higher VPFs than unbiased models, but for a given biasing scheme the VPF is relatively insensitive
to the value of the bias factor $b$, the ratio of rms galaxy fluctuations to rms mass fluctuations.
Thus, while the VPF can distinguish unbiased models from some biased models, it is probably
not a useful way to constrain the bias factor; uncertainties in the appropriate choice of biasing
prescription overwhelm the mild dependence on $b$.

We compare the predictions of our models to the most extensive VPF observations published
to date. These data do not require strong biasing; Gaussian models in which galaxies trace mass
can reproduce the VPF data to within the errors expected from the current finite volume fluctuations.
Models with the moderate biasing predicted by cosmological simulations that incorporate gas
dynamics yield a slightly better match to the data. Models in which galaxy formation is strongly
suppressed in low density regions produce an excess of large, empty voids.

1. Introduction

Giant voids are among the most striking features of the observed distribution of galaxies (e.g.,
Gregory & Thompson 1978; Kirshner et al. 1981, 1987; Davis et al. 1982; de Lapparent, Geller &
Huchra 1986; see review by Rood 1988 and references therein). The remarkable interlocking pattern
of superclusters and voids revealed by galaxy redshift surveys has prompted various authors
to describe the observed galaxy distribution as a "cell structure" (Joeveer & Einasto 1978), a foam
of "bubbles" (de Lapparent, Geller & Huchra 1986), or a "sponge-like" network of interlocking
filaments and tunnels (Gott, Melott & Dickinson 1986). The most widely explored models for the
origin of large-scale structure propose that the clusters, superclusters, and voids that we observe to- 
day developed by gravitational instability from small-amplitude, Gaussian fluctuations, generated 
by physical processes in the very early universe. Can the gravitational growth of Gaussian primor- 
dial fluctuations account for the observed voids, or do these models require that galaxy formation 
be suppressed in low density regions in order to produce voids as large and empty as observed? 
Do voids represent regions where there is no mass, or merely regions where there are no (bright) 
galaxies? In this paper we address these questions using cosmological $N$-body simulations and the
void probability function (VPF), a simple statistical measure of the sizes of voids. The VPF of
a galaxy sample is the probability $P_0(R)$ that a randomly placed sphere of radius $R$ contains no
galaxies. We study the dependence of the VPF on cosmological parameters, on the power spectrum 
of the primordial fluctuations, and - above all - on assumptions about galaxy formation. We also 
compare our results to the most extensive VPF observations published to date (Vogeley, Geller & 
Huchra 1991). 
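
The definition of $P_0(R)$ above translates directly into a Monte Carlo estimator. The following is a minimal sketch, not the procedure used in this paper: it assumes a periodic cubic box, and the function names (`count_in_spheres`, `void_probability`) and the number of trial spheres are illustrative choices. For an unclustered (Poisson) sample the estimate should approach $\exp(-nV)$, where $n$ is the number density and $V = 4\pi R^3/3$.

```python
import numpy as np

def count_in_spheres(points, centers, radius, box):
    """Number of points within `radius` of each center in a periodic cube."""
    counts = np.empty(len(centers), dtype=int)
    for start in range(0, len(centers), 256):        # chunked to limit memory
        c = centers[start:start + 256]
        d = points[None, :, :] - c[:, None, :]
        d -= box * np.round(d / box)                 # minimum-image convention
        r2 = np.sum(d * d, axis=2)
        counts[start:start + 256] = np.sum(r2 < radius**2, axis=1)
    return counts

def void_probability(points, radius, box, n_spheres=2000, seed=0):
    """Fraction of randomly placed spheres that contain no points."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(0.0, box, size=(n_spheres, 3))
    return np.mean(count_in_spheres(points, centers, radius, box) == 0)

# Sanity check on an unclustered sample (hypothetical numbers).
rng = np.random.default_rng(1)
pts = rng.uniform(0.0, 1.0, size=(2000, 3))
p0 = void_probability(pts, radius=0.05, box=1.0)
expected = np.exp(-2000 * 4.0 / 3.0 * np.pi * 0.05**3)
print(p0, expected)
```

Clustered samples give larger values of $P_0(R)$ than the Poisson expectation at the same density, which is what makes the statistic a measure of void sizes.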

The nature of voids is intimately connected to the issue of 'biased galaxy formation,' an idea 
that first gained popularity in the context of the cold dark matter (CDM) model of structure 
formation. The most theoretically attractive version of CDM assumes a critical density ($\Omega = 1$)
universe. Observations of cluster mass-to-light ratios, cluster velocity dispersions, and the galaxy
pairwise velocity dispersion clearly contradict this assumption if galaxies are clustered in the same
way as mass (e.g., Davis et al. 1985, hereafter DEFW). However, if galaxy formation is more
efficient (per unit mass) in regions of high background density, then galaxies will cluster more
strongly than the underlying mass distribution, and an $\Omega = 1$ model can, perhaps, be reconciled
with the observations (DEFW; Bardeen et al. 1986, hereafter BBKS). While the term 'biasing'
might be used to describe any difference between the large-scale galaxy and mass distributions, we 
will use it in the specific sense mentioned above: preferential formation of galaxies in regions of high 
background density. If galaxies form more efficiently than average in high density regions, then 
they must form less efficiently than average in low density regions, so biasing naturally produces 
voids that are empty of galaxies but not completely empty of mass. 

There are observational and theoretical reasons for thinking that biasing might be an important
phenomenon, independent of the $\Omega = 1$ assumption. On the observational side, the well
known morphology-density relation implies that elliptical and spiral galaxies have different clustering
properties (Dressler 1980; Postman & Geller 1984). At least one of these classes of galaxies
must cluster differently than the mass, and there is no particular reason to think that either class 
individually or the union of the two traces the large-scale mass distribution. On the theoretical 
side, galaxy-scale perturbations collapse earlier in regions of high background density, so they tend 
to reach higher internal densities and cool more efficiently than equivalent perturbations in low 
density regions. Indeed, numerical simulations that include gas dynamics indicate that galaxy formation
is at least somewhat biased towards regions of high background density (Cen & Ostriker
1992; Katz, Hernquist & Weinberg 1992; see also White et al. 1987; Kaiser 1988; Gelb 1992).

It has also been argued that gravity alone cannot create voids as large and empty as those
observed, and that the existence of these voids is itself evidence for biased galaxy formation (e.g.,
Dekel & Rees 1987; Betancort-Rijo 1990; Einasto et al. 1991, hereafter EEGS). This line of
reasoning, if correct, enhances the plausibility of $\Omega = 1$ models by providing independent evidence
for biasing, and it suggests that a measure of void sizes like the VPF might offer a sensitive statistical 
diagnostic of biasing. We will assess the strength of the VPF as a biasing diagnostic, and we will 
examine the ability of current theoretical models to explain the observed spectrum of void sizes. 

There have been several recent $N$-body studies of the gravitational growth and interactions
of voids, in idealized configurations (Dubinski et al. 1993; see also West, Weinberg & Dekel 1990),
in void-dominated structure models (Regos & Geller 1991), and in models with Gaussian initial
conditions (van de Weygaert & van Kampen 1993). These papers did not examine the void probability
function or the effects of biased galaxy formation, which will be the central concerns of our
investigation. The two most direct predecessors of the present study are the above cited work by
EEGS and the paper by Weinberg & Cole (1992; hereafter WC). The latter applied the VPF (and
other clustering statistics) to $N$-body models with Gaussian and non-Gaussian initial conditions.
WC found the VPF to be a sensitive discriminant between Gaussian and non-Gaussian models in 
the absence of biasing, but they found, unsurprisingly, that biasing could create large voids in all 
models. Gaussian models proved more successful than any of WC's non-Gaussian models in ex- 
plaining the full range of galaxy clustering data, and the Gaussian hypothesis has received further 
observational support from recent analyses of large-scale galaxy counts (Bouchet et al. 1993 and 
references therein). Since Gaussian fluctuations have both theoretical simplicity and a degree of 
empirical success on their side, we will restrict our attention to Gaussian models in this paper. 

Because of our relatively narrow focus, we have been able to improve upon the studies of EEGS 
and/or WC in a number of ways: 

(1) EEGS's models were normalized by arranging for them to have the same final rms mass fluctuation
on a particular scale. However, the distribution of total (luminous + dark) mass in
the real universe is not well-determined. A more meaningful comparison between theory and 
observation can therefore be obtained if one's theoretical models are adjusted to share an ob- 
servable property, such as the rms fluctuation in galaxy counts on a particular scale. This is 
what we do, and it constitutes the most important difference of principle between our study 
and that of EEGS. 

(2) The initial conditions of $N$-body simulations cannot include contributions from Fourier density
waves that are bigger than the simulation cube, and the enforcement of periodic boundary conditions
during $N$-body evolution prevents such power from developing. A comparable volume
of the real universe does not suffer from these restraints. $N$-body simulations systematically
underestimate the frequency of large voids if the simulation volume is not large enough to
contain all waves that significantly influence the growth of underdense regions. This limitation
may have affected the results of EEGS and WC, whose periodic $N$-body cubes had sides of
40 and 128 $h^{-1}$ Mpc, respectively ($h \equiv H_0/100$ km s$^{-1}$ Mpc$^{-1}$). By contrast, Kauffmann &
Melott (1992) suggest that a CDM simulation would require a ~ 160 $h^{-1}$ Mpc box 'before one
could begin to trust the typical size of voids'. We use simulation cubes that are significantly
larger than those of EEGS and WC, either 300 or 192 $h^{-1}$ Mpc on a side. The large simulation
volume minimizes the systematic effects of missing large-scale waves, and it reduces the
statistical uncertainty in our estimates of the VPF.

(3) EEGS and WC employ a single biasing prescription, which we shall call 'density biasing'. Both
sets of authors evolve an ensemble of mass points to redshift $z = 0$, smooth the resultant density
field, and identify as 'galaxies' only those particles above some sharp cutoff in local density. We
also investigate this biasing prescription, but we examine two other biasing schemes as well.
One of these is the well-known 'peaks biasing' prescription, which identifies galaxies with high
peaks of the initial density field (e.g., DEFW; BBKS). For our simulations of the standard
CDM model, we also consider the biasing relation derived by Cen & Ostriker (1993) from their
hydrodynamic simulations of this scenario. Our three biasing prescriptions differ substantially, 
and together they span a range of physically relevant possibilities. 

(4) EEGS examined biased versions of only 3 broad types of theoretical models: a distribution 
of randomly placed galaxies at z = 0; a distribution of randomly placed clusters at z = 0; 
and a numerically evolved, spatially flat, Gaussian model with the initial power spectrum of
$\Omega_0 = 0.2$ CDM. Only the last of these exhibits the complex clustering characteristic of the
observed galaxy distribution. WC evolved 32 theoretical models, but three quarters of them 
were associated with non-Gaussian initial conditions, and they used only pure power-law initial 
spectra. We will present some results for power-law spectra, but will concentrate our analysis 
on initial power spectra with stronger theoretical and empirical motivation. We do not consider 
non-Gaussian models, but we achieve much better coverage of the most physically promising 
cosmological models than either WC or EEGS. 

(5) The VPF depends rather sensitively on the number density n of the galaxy sample used to 
measure it - the sparser the sample, the bigger the voids. While EEGS computed only the 
VPF, we also compute a related statistic - the underdense probability function ('UPF' or $P_{80}$)
- that is independent of n (except for shot noise, which becomes unimportant at large void 
radii). 

(6) WC examined only two biasing strengths: $b = 1$ (unbiased) and $b = 2$, where $b$ is the 'bias
factor' defined by the relation $\sigma_{8,\rm gal} = b\,\sigma_{8,\rm mass}$. Here $\sigma_8$ is the rms fluctuation of the
density contrast in randomly placed spheres of radius 8 $h^{-1}$ Mpc. While we also explore $b = 1$
and $b = 2$, we investigate two additional biasing strengths of interest ($b = 1.5$ and $b = 3$).

(7) We improve upon the comparison between theory and observation in three respects: 

(i) The VPF data are in redshift space, but EEGS and WC compare them with theoretical 
predictions computed in real space. This can be misleading if the move from real space 
to redshift space significantly changes the VPF. We compute the VPF both in real space 
and in redshift space, but, as consistency requires, we compare the VPF data only with
redshift-space predictions.

(ii) Vogeley, Geller, & Huchra (1991, hereafter VGH) present the most extensive VPF data
published to date. Most of these data were unavailable to EEGS, and WC use results
from only 2 of VGH's 12 volume-limited samples. We compare our predictions with all 12
of these samples, some of which extend to 126 $h^{-1}$ Mpc. The samples used by WC and
EEGS all had limiting depths of < 80 $h^{-1}$ Mpc.

(iii) Unlike both EEGS and WC, we quantify the effects of finite volume errors on current
estimates of the VPF. 
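
The underdense probability function mentioned in item (5) can be estimated with the same sphere-throwing procedure as the VPF. The sketch below is illustrative only. It assumes that $P_{80}$ denotes the probability that a randomly placed sphere is at least 80% underdense, i.e., contains fewer than 20% of the mean expected count; that threshold, the function name, and the periodic-box treatment are assumptions not spelled out in this section.

```python
import numpy as np

def underdense_probability(points, radius, box, threshold=0.2,
                           n_spheres=2000, seed=0):
    """Fraction of random spheres whose count is below threshold * mean count.

    `threshold=0.2` encodes the assumed "80% underdense" definition.
    """
    rng = np.random.default_rng(seed)
    centers = rng.uniform(0.0, box, size=(n_spheres, 3))
    volume = 4.0 / 3.0 * np.pi * radius**3
    mean_count = len(points) / box**3 * volume
    n_under = 0
    for c in centers:
        d = points - c
        d -= box * np.round(d / box)                 # periodic wrap
        count = np.count_nonzero(np.sum(d * d, axis=1) < radius**2)
        n_under += count < threshold * mean_count
    return n_under / n_spheres

# Unclustered test sample (hypothetical numbers).
rng = np.random.default_rng(2)
pts = rng.uniform(0.0, 1.0, size=(2000, 3))
upf_small = underdense_probability(pts, radius=0.05, box=1.0)
upf_large = underdense_probability(pts, radius=0.10, box=1.0)
print(upf_small, upf_large)
```

When the mean count per sphere is of order unity, only empty spheres fall below the threshold and the statistic coincides with the VPF; this is the shot-noise regime noted in item (5). At larger radii the threshold scales with the mean count, which is why the statistic does not depend on the sampling density there.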

The next section provides a detailed description of our models, §3 presents our main results,
and §4 summarizes our conclusions.

2. The Models



2.1 Initial Conditions 

Our basic set of initial conditions consists of an isotropic Gaussian random field (GRF) of
density fluctuations, $\delta(\mathbf{r}) = [\rho(\mathbf{r}) - \bar\rho]/\bar\rho$, defined within a triply-periodic simulation cube. Such
a field can be expanded in terms of complex Fourier components $\delta_{\mathbf{k}} = |\delta_{\mathbf{k}}|\,e^{i\theta_{\mathbf{k}}}$, and its statistical
properties are completely determined by its power spectrum $P(k) = \langle|\delta_{\mathbf{k}}|^2\rangle$. For a GRF, the
phases $\theta_{\mathbf{k}}$ are randomly distributed in the interval $[0, 2\pi]$, and the amplitudes $|\delta_{\mathbf{k}}|$ are
Rayleigh-distributed with variance $P(k)$. We ensure these conditions by assigning values drawn from a
Gaussian distribution with variance $P(k)/2$ separately to the real and imaginary parts of each $\delta_{\mathbf{k}}$
(cf. BBKS). These Fourier components then allow the computation of a GRF $\delta(\mathbf{r})$ with a specified
power spectrum $P(k)$.
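
The construction of a GRF with a specified $P(k)$ can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code used for the simulations: instead of drawing the real and imaginary parts of each mode directly, it filters real-space white noise in Fourier space, which yields Gaussian modes with random phases while automatically respecting the Hermitian symmetry required of a real field. The function name, the arbitrary overall amplitude, and the example power law are all hypothetical.

```python
import numpy as np

def gaussian_random_field(n, box_size, power_spectrum, seed=0):
    """Real Gaussian random field on an n^3 periodic grid.

    `power_spectrum` is a callable P(k); the overall amplitude is left
    unnormalized (it would be fixed later, e.g. through sigma_8).
    """
    rng = np.random.default_rng(seed)
    white_k = np.fft.rfftn(rng.standard_normal((n, n, n)))

    # Wavenumber magnitude on the half-complex (rfftn) grid.
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=box_size / n)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=box_size / n)
    kx, ky, kz = np.meshgrid(k_full, k_full, k_half, indexing="ij")
    k = np.sqrt(kx**2 + ky**2 + kz**2)

    # Scale each mode by sqrt(P(k)); the k = 0 mode is zeroed so that the
    # field has zero mean, as a density contrast must.
    pk = np.zeros_like(k)
    nonzero = k > 0
    pk[nonzero] = power_spectrum(k[nonzero])
    return np.fft.irfftn(white_k * np.sqrt(pk), s=(n, n, n))

# Example: a small n = -1 power-law field (illustrative parameters).
field = gaussian_random_field(16, 100.0, lambda k: k**-1.0, seed=3)
print(field.shape, field.mean())
```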

The three cosmological scenarios that we examine in greatest detail have initial power spectra
described by equation (7) of Efstathiou, Bond, & White (1992; hereafter EBW), with two different
values of their $\Gamma$ parameter. With $\Gamma = \Omega_0 h$, this equation provides an accurate fit to the
post-recombination, linear power spectrum of adiabatic fluctuations that enter the horizon with
scale-invariant amplitudes and grow thereafter in a universe dominated by cold dark matter. Our first
model is 'standard CDM', with $\Omega = 1$ and $h = 0.5$ (implying $\Gamma = 0.5$). While the standard
CDM model has many attractive features, a number of observations suggest that the $\Gamma = 0.5$
spectrum lacks sufficient power on large scales (e.g., Efstathiou et al. 1990; Maddox et al. 1990;
Saunders et al. 1991; Moore et al. 1992; Vogeley et al. 1992; Fisher et al. 1993). A better fit to
galaxy clustering data is obtained for $\Gamma = 0.25$, and we consider two models with this spectrum.
The first is an $\Omega_0 = 0.3$ CDM model, which naturally produces this power spectrum if $h \approx 0.8$.
For our standard version of this model we include a cosmological constant, $\lambda_0 \equiv \Lambda/(3H_0^2) = 0.7$, so
that the spatial curvature vanishes, as predicted by inflationary cosmology (see review by Narlikar
& Padmanabhan 1991). We also examine an $\Omega = 1$ model with $\Gamma = 0.25$. There are various
physical scenarios that could produce a $\Gamma \approx 0.25$ spectrum in an $\Omega = 1$ universe. They include
models in which equalization occurs late because a decaying particle enhances the background
neutrino density (e.g., Bond & Efstathiou 1991) or because the Hubble constant is very low. One
particularly interesting possibility is a universe dominated by a mixture of hot and cold dark
matter - such a model may best explain galaxy clustering and velocity statistics, as well as the
microwave background fluctuations recently measured by COBE (e.g., EBW; Davis, Summers &
Schlegel 1992; Taylor & Rowan-Robinson 1992; Klypin et al. 1993). From a theoretical perspective,
all of the proposed models that yield a $\Gamma = 0.25$ spectrum seem somewhat contrived, but one can
remain agnostic on the subject of theoretical underpinning and simply view the $\Gamma = 0.25$ spectrum
as a reasonably successful empirical fit to observational data.

Figure 1 shows the $\Gamma = 0.5$ and $\Gamma = 0.25$ initial power spectra, extrapolated via linear
perturbation theory to $z = 0$ and normalized so that $\sigma_{8,\rm mass} = 1$, i.e., so that the rms mass fluctuation is
unity in randomly placed spheres of radius 8 $h^{-1}$ Mpc. The $\Gamma = 0.25$ spectrum has more power on
scales larger than this normalization scale. The vertical arrows in this plot indicate the lowest and
highest comoving wavenumbers included in our initial conditions: $k/k_f = 1$ (the wavenumber of
the fundamental mode of our 300 $h^{-1}$ Mpc simulation box) and $k/k_f = 50$ (the Nyquist frequency
of the $100^3$ grid used to set up our initial conditions). Note that both of the spectra in Figure
1 approach an $n = 1$ power law as $k \to 0$, in agreement both with inflationary cosmology (see
Narlikar & Padmanabhan 1991) and with COBE's recent observations of temperature fluctuations
in the cosmic microwave background radiation (Smoot et al. 1992).

Figure 1. Initial power spectra for our $\Gamma = 0.5$ and $\Gamma = 0.25$ models. Both spectra have been extrapolated via
linear perturbation theory to $z = 0$, and normalized so that the rms mass fluctuation is unity in randomly placed
spheres of radius 8 $h^{-1}$ Mpc. On scales larger than this, the $\Gamma = 0.25$ spectrum has more power than the $\Gamma = 0.5$
spectrum. Vertical arrows indicate the lowest and highest comoving wavenumbers included in our initial conditions:
$k = k_f$ (the wavenumber of the fundamental mode of our 300 $h^{-1}$ Mpc simulation box) and $k = 50\,k_f$ (the Nyquist
frequency of the $100^3$ grid used to set up our initial conditions).
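
The normalization described for Figure 1 can be illustrated numerically. The sketch below assumes the commonly quoted form of the EBW fitting formula, $P(k) \propto k\,\{1 + [ak + (bk)^{3/2} + (ck)^2]^{\nu}\}^{-2/\nu}$ with $a = 6.4/\Gamma$, $b = 3.0/\Gamma$, $c = 1.7/\Gamma$ (in $h^{-1}$ Mpc) and $\nu = 1.13$; these coefficients are reproduced from memory rather than from the present text and should be checked against EBW before use. The rms fluctuation in a top-hat sphere is the standard integral $\sigma^2(R) = (2\pi^2)^{-1}\int k^2 P(k)\,W^2(kR)\,dk$. Function names are illustrative.

```python
import numpy as np

def ebw_power(k, gamma, nu=1.13):
    """Unnormalized CDM-like spectrum; k in h/Mpc (coefficients assumed)."""
    a, b, c = 6.4 / gamma, 3.0 / gamma, 1.7 / gamma
    return k / (1.0 + (a * k + (b * k)**1.5 + (c * k)**2)**nu)**(2.0 / nu)

def tophat_window(x):
    """Fourier transform of a uniform sphere of unit radius."""
    return 3.0 * (np.sin(x) - x * np.cos(x)) / x**3

def sigma_R(radius, gamma, amplitude=1.0):
    """rms linear mass fluctuation in spheres of `radius` (h^-1 Mpc)."""
    lnk = np.linspace(np.log(1e-4), np.log(1e2), 20000)
    k = np.exp(lnk)
    integrand = (amplitude * ebw_power(k, gamma) * k**3
                 * tophat_window(k * radius)**2 / (2.0 * np.pi**2))
    # Trapezoidal rule in ln k.
    return np.sqrt(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(lnk)))

def normalized_amplitude(gamma):
    """Amplitude that gives sigma_8 = 1 for the chosen Gamma."""
    return 1.0 / sigma_R(8.0, gamma)**2

amp_025 = normalized_amplitude(0.25)
amp_050 = normalized_amplitude(0.50)
s30_025 = sigma_R(30.0, 0.25, amp_025)
s30_050 = sigma_R(30.0, 0.50, amp_050)
print(s30_025, s30_050)
```

With both spectra fixed to $\sigma_8 = 1$, the $\Gamma = 0.25$ model should return the larger rms fluctuation at 30 $h^{-1}$ Mpc, consistent with the statement that it has more power on scales above the normalization scale.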

In order to study the dependence of the VPF on cosmological parameters, we also consider
models with the $\Gamma = 0.25$ power spectrum and $(\Omega_0, \lambda_0)$ = (0.3, 0), (0.1, 0.9), and (0.1, 0). We do
not change the power spectrum in concert with $\Omega_0$ because we want to separate dependence on
cosmological parameters from dependence on the power spectrum. We will show in §3.3 that the
VPF is insensitive to $\Omega_0$ and $\lambda_0$ if the power spectrum, normalization, and biasing scheme are held
fixed. We therefore do not carry out a detailed comparison between these last three models and
observational data.

The $\Gamma = 0.25$ and $\Gamma = 0.5$ power spectra are not radically different over the range of scales
probed by our simulations. We want to study the dependence of the VPF on the shape of the initial
spectrum using a somewhat larger lever arm than these models provide, so we also examine models
with pure power-law initial spectra, $P(k) \propto k^n$. A pure power-law spectrum that failed to turn over
or cut off would violate observational constraints on either very large or very small scales, but a
power law might be a reasonable approximation to the true power spectrum over some finite range.
'Standard CDM' predicts a post-recombination power spectrum approximately like $P(k) \propto k^{-1}$ on
comoving scales ~ 3-10 $h^{-1}$ Mpc, and a variety of observations support this prediction (Gott
& Rees 1975; Gott & Turner 1977; Bhavsar, Gott & Aarseth 1981; Gunn 1982; Gott et al. 1989,
1992). For our power-law spectra, we therefore consider $n = -1$ and its 'bracketing values' of
$n = -2$ and $n = 0$.

2.2 Non-linear evolution

We evolve our initial conditions into the non-linear regime using a particle-mesh (PM) $N$-body
code written by Changbom Park. This code employs a staggered-mesh technique (Melott 1986) to
achieve higher force resolution than a conventional PM code. The code is thoroughly described by
Park (1990), and it has been tested against analytic solutions and other $N$-body codes (Park 1990;
Weinberg et al., in preparation, hereafter W+). The W+ tests show that this PM code reproduces
the results of high-resolution P$^3$M (e.g., Efstathiou et al. 1985) and tree (e.g., Hernquist, Bouchet,
& Suto 1991) $N$-body codes down to the limit of its force resolution, about 1-2 mesh cells. The
chief advantage of the PM method is speed - P$^3$M or tree code simulations with the same mass
resolution as our PM experiments would have taken several times longer to perform.

PM simulations require a trade-off between force resolution and the size of the simulation
volume, since the dimensions of the computational grid are usually limited by memory and cpu
constraints. We need a large simulation volume for the reasons discussed in §1, but because the
VPF is insensitive to small-scale clustering we can get away with relatively low force resolution.
Our simulations of $\Gamma$ models (those with a CDM-like power spectrum) use $100^3$ particles on a
$200^3$ density-potential mesh to represent a 300 $h^{-1}$ Mpc, comoving, periodic cube. For simulations
with power-law initial spectra (about whose realism we are somewhat less concerned), we use $64^3$
particles on a $128^3$ mesh, and a 192 $h^{-1}$ Mpc simulation cube. In all cases the mesh scale is ~ 1.5
$h^{-1}$ Mpc. We have used trial runs of smaller volumes to confirm that we obtain the same VPF with
a force resolution (mesh scale) of 0.5, 0.75, 1.0, and 1.5 $h^{-1}$ Mpc. We have also checked that the
VPF is unaffected by using 1 particle per 8 cells during dynamical evolution instead of 1 particle
per cell; the sparser particle grid allows us to use a larger simulation volume with a fixed computer
memory. Although we use a $200^3$ mesh (or $128^3$ for power-law models) to compute forces during
non-linear evolution, we set up our initial conditions using a $100^3$ (or $64^3$) density field so that we
do not alias high frequency power into our initial particle distributions.

We use the Zel'dovich approximation to turn our initial density fields into initial positions and
growing-mode velocities for our $N$ particles (see Doroshkevich et al. 1980; Efstathiou et al. 1985).
All our simulations begin at a redshift $z_{\rm init}$ and advance to the present epoch in $z_{\rm init}$ timesteps, which
are equally spaced in the expansion factor $a(t)$. We use $z_{\rm init} = 31$ for the $\Gamma$ model simulations,
and $z_{\rm init} = 24$ for the power-law simulations. Again, we have used trial runs on smaller volumes to
check our choices of initial redshift and timestep. Doubling $z_{\rm init}$ or the number of timesteps would
not alter the VPF of our simulations. Halving $z_{\rm init}$ or the number of timesteps would have a tiny
impact in a few cases. We computed two realizations of each of the $\Gamma$ models, and four realizations
of each of the power-law models.
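
As an illustration of the set-up step, the Zel'dovich displacement field can be obtained from a gridded density contrast by solving $\nabla\cdot\boldsymbol{\psi} = -\delta$ in Fourier space, $\boldsymbol{\psi}_{\mathbf{k}} = i\mathbf{k}\,\delta_{\mathbf{k}}/k^2$. The sketch below is a generic textbook version, not Park's code; the function name is hypothetical, and growing-mode velocities (proportional to the displacements) are omitted. The last lines show the timestep spacing implied by the text for $z_{\rm init} = 31$.

```python
import numpy as np

def zeldovich_displacement(delta, box):
    """Displacement field psi with div(psi) = -delta on a periodic n^3 grid.

    Returns an array of shape (n, n, n, 3). Initial positions would be the
    unperturbed grid points plus psi (wrapped into the box).
    """
    n = delta.shape[0]
    delta_k = np.fft.rfftn(delta)
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k_full, k_full, k_half, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                      # avoid 0/0; the mean mode carries no displacement
    psi = [np.fft.irfftn(1j * kc * delta_k / k2, s=delta.shape)
           for kc in (kx, ky, kz)]
    return np.stack(psi, axis=-1)

# Single plane wave along x: delta = A cos(k0 x) gives psi_x = -(A/k0) sin(k0 x).
n, box, amp = 32, 100.0, 0.01
x = np.arange(n) * box / n
k0 = 2.0 * np.pi / box
delta = amp * np.cos(k0 * x)[:, None, None] * np.ones((n, n, n))
psi = zeldovich_displacement(delta, box)

# Timesteps equally spaced in expansion factor, from a = 1/(1+z_init) to a = 1.
z_init = 31
a_steps = np.linspace(1.0 / (1.0 + z_init), 1.0, z_init + 1)
print(a_steps[1] - a_steps[0])
```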

2.3 Normalization 

The amplitude of primordial density fluctuations cannot presently be calculated a priori from
theory, and we treat it as a free parameter of our models. The choice of amplitude is important, since
larger amplitude perturbations produce stronger clustering and larger voids. [In an $\Omega = 1$ model,
choosing the amplitude normalization is equivalent to choosing the time in an evolving $N$-body
simulation that corresponds to the present, and voids grow as a simulation evolves.] We normalize
our models by making them fit an observational constraint, viz., the rms fluctuation of galaxy
counts in spheres of radius 8 $h^{-1}$ Mpc at the present epoch $t_0$. Estimates of the galaxy correlation
function (e.g., Davis & Peebles 1983) imply that this quantity is close to unity, i.e., $\sigma_{8,\rm gal}(t_0) \approx 1$.
This condition and the relation $\sigma_{8,\rm gal} = b\,\sigma_{8,\rm mass}$ imply that for our unbiased models - where
galaxies are assumed to trace the mass distribution ($b = 1$) - the present epoch can be identified as
the time when the rms mass fluctuation in spheres of radius 8 $h^{-1}$ Mpc equals one, i.e., $\sigma_{8,\rm mass} = 1$.

In the $\Gamma$ models, the quantity $\sigma_{8,\rm mass}$ evolves at close to its linear theory rate throughout our
simulations. We therefore normalize our initial conditions so that they have $\sigma_8 = 1$ if extrapolated
to $z = 0$ by linear theory. For some of the power-law models, particularly $n = 0$, the growth of $\sigma_8$
departs significantly from linear theory, as discussed by WC. We adopt the normalization amplitudes
computed by WC (see their table 1), so that the non-linear values of $\sigma_8$ in the simulations are equal
to one at $z = 0$. The corresponding linear theory amplitudes are $\sigma_8 = 1.48$, 1.03, and 0.95 for
$n = 0$, $-1$, and $-2$, respectively.

The VPF is a sensitive function of the mean galaxy density $n$, or equivalently, of the characteristic
inter-galaxy separation $d = n^{-1/3}$ (increasing the separation of galaxies increases the sizes
of voids). It is therefore important that our simulated data have the same density as the real data
to which they are being compared. Once we evolve our normalized initial conditions to $z = 0$, we
therefore require that our simulated galaxies have the characteristic separation $d = 4.5\ h^{-1}$ Mpc
of VGH's densest sample. To create an unbiased galaxy population with $d = 4.5\ h^{-1}$ Mpc, we
randomly sample the particles in our final mass distribution to this density.
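
The random sampling step amounts to keeping $(L/d)^3$ particles chosen uniformly at random from a box of side $L$. A minimal sketch follows; the function name is hypothetical and the small test box is chosen only to keep the example quick.

```python
import numpy as np

def sample_to_separation(positions, box, d, seed=0):
    """Randomly keep enough particles that n^(-1/3) equals d."""
    n_target = int(round((box / d)**3))
    if n_target > len(positions):
        raise ValueError("not enough particles to reach the requested density")
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(positions), size=n_target, replace=False)
    return positions[keep]

# Small illustrative box; units are h^-1 Mpc.
rng = np.random.default_rng(4)
particles = rng.uniform(0.0, 30.0, size=(1000, 3))
galaxies = sample_to_separation(particles, box=30.0, d=4.5)

# For the 300 h^-1 Mpc cube and d = 4.5 h^-1 Mpc the same rule gives:
n_full_box = int(round((300.0 / 4.5)**3))
print(len(galaxies), n_full_box)
```

For the 300 $h^{-1}$ Mpc cube this corresponds to roughly 3 × 10$^5$ of the 10$^6$ simulation particles.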

The normalization scheme discussed so far ignores the possibility of biased galaxy formation.
To create biased simulations with bias factors $b = 1.5$, 2, and 3, we reduce the amplitudes of the
initial mass fluctuations by a factor of $b$, and evolve them to $z = 0$ by an $N$-body simulation as
before. We then select a biased (instead of random) subset of the particles to represent galaxies.
Our density biasing and peaks biasing prescriptions each contain an adjustable parameter that
controls the strength of clustering in the biased particle subset. We select this parameter so that
$\sigma_{8,\rm gal} = 1$. To the extent that $\sigma_{8,\rm mass}$ follows linear theory, the ratio of rms galaxy fluctuations to
rms mass fluctuations at the final epoch on the scale of 8 $h^{-1}$ Mpc is $b$.
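
Written out, the bookkeeping behind this last statement is as follows, assuming (as stated above) that the mass fluctuation amplitude grows according to linear theory and that the unbiased initial conditions are normalized to unit linear $\sigma_8$ at $z = 0$:

```latex
\sigma_{8,\mathrm{mass}}(z=0) \simeq \frac{1}{b},
\qquad
\sigma_{8,\mathrm{gal}}(z=0) = 1
\quad\Longrightarrow\quad
\frac{\sigma_{8,\mathrm{gal}}}{\sigma_{8,\mathrm{mass}}} \simeq b .
```

For example, a $b = 2$ model has mass fluctuations of rms amplitude about 0.5 in 8 $h^{-1}$ Mpc spheres at the final epoch, while the selected galaxy subset has rms fluctuations of unity.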

Instead of trying a variety of bias factors, we could have normalized the mass fluctuations 
in our models by matching the amplitude of temperature fluctuations in the cosmic microwave 
background (CMB), as observed by COBE (Smoot et al. 1992). We did not take this approach, for 
several reasons. First, predicting CMB fluctuations on the large angular scales probed by COBE 
would require us to extrapolate our model power spectra to wavelengths much larger than those that 
influence the formation of voids. This extrapolation would make little physical sense for power-law 
spectra, and even for the $\Gamma$ models the resulting normalization would be unduly sensitive to our
precise choice of parameters. For example, lowering $\Gamma$ from 0.25 to 0.2 at fixed $\sigma_8$ would have a
negligible impact on void sizes, but it would significantly alter the relation between $\sigma_8$ and COBE-scale
CMB fluctuations. Second, there are still significant random and systematic uncertainties in
the COBE fluctuation amplitude. Finally, it is possible that some fraction of the COBE signal
arises from tensor mode (gravity wave) fluctuations instead of the scalar mode fluctuations that
correspond to density perturbations (Davis et al. 1992; Liddle & Lyth 1992; Lidsey & Coles 1992;
Lucchin et al. 1992; Salopek 1992; Souradeep & Sahni 1992). For what it is worth, EBW find
that the best-fit COBE amplitude implies $\sigma_8 \approx 1.1$, 0.5, and 1.25 respectively for our spatially flat
models with $(\Gamma, \Omega_0)$ = (0.5, 1), (0.25, 1), and (0.25, 0.3), assuming a truly scale-invariant primeval
spectrum and no contribution from gravity waves. 

2.4 Biasing Schemes 

In principle, the existence and nature of biased galaxy formation should be a prediction of a 
theoretical model, not an input. A complete theory specifies both the initial conditions and the 
important physical processes for structure formation, and from these one should be able to compute 
where the galaxies form and how they cluster relative to the mass. In practice, the necessary
computations are very difficult, and there are uncertainties in the appropriate treatment of gas 
physics and star formation. Most large-scale structure simulations therefore rely on simplified 
prescriptions for identifying "galaxy" particles. Simplest of all is the assumption that galaxies 
trace the mass, which is the rule we adopt in our unbiased models. We also consider two different
prescriptions for biased galaxy formation, one that identifies galaxies with particles that lie above 
a sharp threshold density in the final conditions, and one that associates galaxies statistically with 
high peaks of the initial density field. For the b = 1.5 CDM model we also try a somewhat more 
sophisticated approach, applying a galaxy identification algorithm that has been "calibrated" from 
cosmological simulations with hydrodynamics. 

This last scheme is based on the simulation of Cen & Ostriker (1993, hereafter CO; see also
Cen & Ostriker 1992), and we refer to it as 'C/O biasing.' CO simulate the standard CDM model
($\Omega = 1$, $\Gamma = 0.5$) with $\sigma_{8,\rm mass} = 0.77$ and a simulation volume 80 $h^{-1}$ Mpc on a side. Their
simulations include dissipation and star formation in a baryonic component, and they apply a
percolation algorithm to group "star" particles into "galaxies" at $z = 0$. In their analysis, they
fit the relation between the smoothed galaxy number density $n_{\rm gal}$ and the smoothed mass density
$\rho_{\rm mass}$ to the functional form

$$\log(n_{\rm gal}/\bar n_{\rm gal}) = A + B \log(\rho_{\rm mass}/\bar\rho_{\rm mass}) + C\,[\log(\rho_{\rm mass}/\bar\rho_{\rm mass})]^2 . \qquad (1)$$

CO list values of A, B, and C for Gaussian smoothing filters of various radii. We want to select
particles from an N-body simulation, and for this purpose it is easier to use cubic cells rather than
Gaussian filters, since every particle is a member of a distinct cell. R. Cen has kindly computed
the parameters of equation (1) for us using 2 $h^{-1}$ Mpc cubic cells. To select biased particles,
we compute the final mass density field on a $150^3$ grid and apply equation (1) to compute the
galaxy density in each cell, using a mean galaxy density $\bar{n}_{\rm gal}$ equal to the density of VGH's densest
observational sample (for which $d = \bar{n}_{\rm gal}^{-1/3} = 4.5$ $h^{-1}$ Mpc). We then select particles from cell
x with probability $p = n_{\rm gal}(x)/n_p(x)$, where $n_p(x)$ is the particle number density in the cell. We
use cloud-in-cell weighting to compute the density field and to assign selection probabilities to
particles. For 2 $h^{-1}$ Mpc cells, Cen finds B = 1.9 and C = -0.20; the value of A is determined by
the requirement that the mean density of selected particles equal the desired mean galaxy density
$\bar{n}_{\rm gal}$. As a check, we have computed galaxy and mass density fields from our biased and unbiased
particle distributions, with Gaussian smoothing filters of radius 3 and 5 $h^{-1}$ Mpc. The relation
between smoothed galaxy density and smoothed mass density is then close to equation (1), as
evaluated with the values of B and C that CO report for these smoothing radii. This agreement
confirms that our particle selection scheme provides a reasonable representation of CO's biasing
results.
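The selection step described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the function name `co_bias_select` and its arguments are hypothetical, it uses nearest-grid-point assignment instead of the cloud-in-cell weighting used in the paper, and it fixes A by requiring the cell-averaged galaxy density to equal the target mean density.

```python
import math
import random
from collections import Counter


def co_bias_select(positions, box, ncell, B, C, n_gal_mean, rng):
    """Return (keep, A): one Boolean flag per particle, and the fitted constant A.

    Hypothetical helper: applies equation (1) cell by cell and keeps each
    particle with probability p = n_gal(x) / n_p(x).
    """
    cell = box / ncell
    # Nearest-grid-point cell of each particle (assumption: the paper uses CIC).
    cells = [tuple(int(c / cell) % ncell for c in p) for p in positions]
    counts = Counter(cells)
    mean_count = len(positions) / ncell**3

    # Equation (1) without the constant A: 10**(B x + C x^2), x = log10(rho/rho_bar).
    # Empty cells never appear in `counts`, so they receive no galaxies.
    shape = {}
    for key, n in counts.items():
        x = math.log10(n / mean_count)
        shape[key] = 10.0 ** (B * x + C * x * x)

    # Choose A so that the galaxy density averaged over all cells is n_gal_mean.
    A = math.log10(ncell**3 / sum(shape.values()))

    keep = []
    for key in cells:
        n_gal_cell = n_gal_mean * cell**3 * 10.0**A * shape[key]  # expected galaxies in cell
        p = min(1.0, n_gal_cell / counts[key])  # selection probability per particle
        keep.append(rng.random() < p)
    return keep, A
```

With B = 1.9 and C = -0.20 as quoted above, a call such as `co_bias_select(positions, 300.0, 150, 1.9, -0.20, 4.5**-3, random.Random(0))` would correspond to 2 $h^{-1}$ Mpc cells and d = 4.5 $h^{-1}$ Mpc. If any probability is clipped at 1 the realized mean density falls slightly below the target, which this sketch does not correct.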

Of the three biasing schemes that we use, the C/O scheme has the clearest physical motivation.
However, one should bear in mind that the CO simulations have a resolution of only 400 $h^{-1}$ kpc,
which may not be adequate for identifying galaxies reliably. The simulations of Katz, Hernquist &
Weinberg (1992) and Evrard, Summers & Davis (1993), which model the baryon component using
smoothed-particle hydrodynamics, have much higher spatial resolution in galaxy-forming regions,
but the simulation volumes are not large enough to allow a meaningful definition of a biasing
relation like equation (1). For now, we take CO's result as a plausible illustration of the sort of
biasing relation that can be derived from cosmological simulations with gas dynamics and galaxy
formation. Since the C/O relation is derived specifically for standard CDM with b ~ 1.5, we apply
it only to this particular model. Changing the normalization, the shape of the power spectrum,
or the value of $\Omega_0$ would alter the collapse times of galaxy-scale perturbations, so it is not at all
obvious how the properties of biasing would respond to these changes.

Our 'density biasing' scheme is similar to the C/O scheme discussed above, but instead of the
non-linear function (1), we simply adopt a sharp threshold: galaxies form with equal probability
where the smoothed mass density exceeds some value, and they do not form at all where the mass
density is below this value. Specifically, we create a final density field from the z = 0 distribution
of mass points, and smooth it with a Gaussian filter of radius $R_s$. We set $R_s = 2.25$ $h^{-1}$ Mpc for
consistency with the peaks biasing procedure described below. We then compute the smoothed
density at the final location of each particle, and select some fraction $f_b$ of particles that have the
highest local densities. As this fraction diminishes, the corresponding subset of particles becomes
more and more biased towards the densest regions of the simulation. We randomly sample this
biased subset to the desired inter-galaxy separation d = 4.5 $h^{-1}$ Mpc, and we identify the resultant
population of points as the 'galaxies' of interest. The value of $f_b$ is chosen so that $\sigma_{8,\rm gal}(f_b)$ for this
biased and sampled subset matches that of the corresponding unbiased simulation. This sharp
threshold in final density is the biasing scheme used by EEGS and by WC.
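The two-stage selection (sharp density threshold, then random dilution) can be illustrated as follows. This is a minimal sketch under stated assumptions: `density_bias_select` is a hypothetical name, the smoothed density at each particle is taken as already computed, and the tuning of the fraction $f_b$ against the galaxy fluctuation amplitude is not shown.

```python
import random


def density_bias_select(local_density, f_b, n_target, rng):
    """Return indices of n_target 'galaxies' drawn from the densest fraction f_b.

    local_density: smoothed mass density at each particle's final position.
    f_b:           fraction of particles above the sharp density threshold.
    n_target:      number of galaxies wanted after random dilution.
    """
    n_keep = max(1, round(f_b * len(local_density)))
    if n_target > n_keep:
        raise ValueError("biased subset is smaller than the requested sample")
    # Rank particles by smoothed density, densest first; the first n_keep
    # entries are exactly the particles above the implied threshold.
    order = sorted(range(len(local_density)),
                   key=local_density.__getitem__, reverse=True)
    biased = order[:n_keep]
    # Random dilution of the biased subset to the desired number density.
    return rng.sample(biased, n_target)
```

Lowering `f_b` raises the implied density threshold, so the surviving particles trace only the densest regions, which is the sense in which the subset becomes more strongly biased.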

The 'peaks biasing' scheme, which has been widely used in previous studies of CDM models, 
is simple in conception but complicated to implement in practice. One might naturally expect that 
peaks of the primordial density field would become the first sites of non-linear gravitational collapse 
(though Katz, Quinn & Gelb [1993] find a rather poor correspondence between galaxy-scale peaks 
of the initial density field and collapsed dark halos in N-body simulations). The assumption behind
peaks biasing is that bright galaxies form only at galaxy-scale peaks of the initial density field that
lie above some global threshold $\nu_t = \delta_{\rm mass}/\sigma_{\rm mass} > 1$. Lower height peaks are assumed to form
underluminous galaxies or structures not recognized as galaxies at all. A number of authors have
discussed physical mechanisms that might lead to such a thresholding effect, among them Rees 
(1985), Silk (1985), Dekel & Silk (1986), White et al. (1987), Kaiser (1988), and Cole & Kaiser 
(1989). 

Kaiser (1984) showed that the high peaks of a Gaussian density field cluster more strongly 
than the underlying mass distribution, and he suggested that this phenomenon might explain the 
high amplitude of the observed cluster-cluster correlation function. BBKS applied the same idea 
to galaxy formation, and they computed many of the statistical and clustering properties of peaks 
of Gaussian fields. The strength of biasing between the high peaks and the underlying density 
field depends on the importance of long wavelength perturbations in the initial conditions. Long 
wavelength perturbations modulate the background density, and they therefore alter the local 
effective threshold height, i.e., the relative ease with which a peak relative to the local background 
can rise above a global threshold. In a Gaussian random field, the number density of rare peaks is 
a strong function of peak height, so small changes in the effective threshold height can lead to large 
changes in the local number density of peaks above a global threshold. If the primordial fluctuations 
have appreciable large-scale power (as in CDM), then the high peaks occur preferentially in regions 
of high background density. Bright galaxies that form around these high peaks are 'born' clustered, 
so they provide biased tracers of the underlying mass distribution. 

When applied to a given mass distribution, the peaks biasing scheme has two free parameters.
The first is the smoothing scale for defining galaxy peaks, $R_s$ (in our case, $R_s$ is the radius of a
Gaussian window function). This radius defines the characteristic physical size of the peaks, so it
should correspond approximately to the mass scale of a typical galaxy. Since the precise relation
between the smoothing scale and the collapsed mass is uncertain, we have a fair amount of freedom
in our choice of $R_s$. The second free parameter is the peak threshold height $\nu_t$, which measures the
'difficulty' of galaxy formation. The values of $R_s$ and $\nu_t$ can be fixed by requiring that the biased
model with which they are associated fit two observational numbers, usually taken to be the galaxy
number density (or the associated characteristic separation $d = \bar{n}^{-1/3}$, in our case 4.5 $h^{-1}$ Mpc)
and the rms amplitude of galaxy count fluctuations on some scale (in our case, $\sigma_{8,\rm gal} = 1$).

Our initial conditions do not resolve galaxy scales, so we implement peaks biasing via the
peak-background split approximation, which allows one to identify particles with galaxy-scale peaks in
a statistical manner, given the density field smoothed over a larger scale (BBKS; see also Park
1991). The approximation relies on the fact that $\delta_s$ (the full initial density field smoothed on the
galaxy scale $R_s$) has the same power spectrum and hence the same statistical properties as the
field $\delta_b + \delta_p$, where $\delta_b$ is a field with the same underlying power spectrum as $\delta_s$ but smoothed on
a larger, background scale $R_b$, and $\delta_p$ is a 'peaks' field whose power spectrum is that of $\delta_s$ minus
that of $\delta_b$. One can therefore estimate the local density of peaks in $\delta_s$ above a global threshold
$\nu_t$ as the number of peaks in $\delta_p$ above a local effective threshold, whose value is modulated by
the background field $\delta_b$. The use of the peak-background split approximation introduces a third
parameter into our biasing prescription, the background smoothing length $R_b$. Given our choice of
$R_s$ (see below), we adopt a Gaussian window radius $R_b = 2.25$ $h^{-1}$ Mpc. This value ensures that
$3 < R_b/R_s < 5$, as required for best results in the peak-background split approximation (BBKS,
Park 1991). With this condition satisfied, our results should not be sensitive to the exact value of
$R_b$.
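The split of the power spectrum described above can be written out explicitly. The following is a sketch that assumes the Gaussian window convention $W(kR) = \exp(-k^2R^2/2)$ (an assumption made here for illustration; the text does not state the convention) and writes $P(k)$ for the unsmoothed initial power spectrum:

```latex
\begin{align}
P_s(k) &= P(k)\,e^{-k^2 R_s^2}, \qquad P_b(k) = P(k)\,e^{-k^2 R_b^2}, \\
P_p(k) &= P_s(k) - P_b(k)
        = P(k)\left[e^{-k^2 R_s^2} - e^{-k^2 R_b^2}\right] \ge 0
        \quad \text{for } R_b > R_s .
\end{align}
```

Because $R_b > R_s$, the 'peaks' spectrum is non-negative at every k, so a Gaussian field with this spectrum exists. With the galaxy-scale radius of 0.57 $h^{-1}$ Mpc adopted later in this section, the ratio is $2.25/0.57 \approx 3.9$, inside the quoted range.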

In all our implementations of peaks biasing, we use equation (6.35) of BBKS to calculate
the expected local number density $n_{\rm pk}(\delta_b)$ of galaxy-scale peaks above $\nu_t$, as a function of the
background field height $\delta_b$. We identify a mass particle that lies in a given cell x of the initial
conditions as a 'galaxy' with a probability equal to $A \cdot V \cdot n_{\rm pk}[\delta_b(x)]$, where V is the cell volume
and A is a constant of proportionality discussed below. At a mechanical level, this operation is
quite similar to that in density biasing or C/O biasing, except that we select particles based on
the initial rather than the final density field, and we use the rather complicated relation between
galaxy and mass density implied by the peaks formulae instead of a sharp threshold function or
the C/O relation. In a strict implementation of peaks biasing, the proportionality constant A is
equal to one, and in this case the biasing operation has a clear physical interpretation: the selected
particles correspond, statistically, to galaxy-scale peaks of the initial density field. To compare
our models with VGH's brighter, less dense samples (those with d > 4.5 $h^{-1}$ Mpc), we randomly
sample our biased particle distributions to the desired number density. By so doing, we make the
implicit assumption that galaxy biasing is independent of luminosity, at least for galaxies brighter
than those of VGH's densest sample (absolute magnitudes M < -18.5). We could instead have
identified brighter galaxies with higher, rarer peaks, which are more strongly clustered. The first
assumption makes physical sense if random factors independent of peak height determine a galaxy's
final luminosity.

At a fixed value of $R_s$, the strength of biasing is quite sensitive to the threshold $\nu_t$. One might
therefore think that the peaks biasing prescription could yield any bias factor of interest with an
appropriate choice of parameters, in particular that for any value $1/3 \le \sigma_{8,\rm mass} \le 1$ one could find
values of the peak parameters $\nu_t$ and $R_s$ that would yield $\sigma_{8,\rm gal} = 1$. However, there is a second
observational constraint to be matched, in our case the characteristic separation d = 4.5 $h^{-1}$ Mpc.
We find that with the galaxy density or separation imposed as a constraint, the strict peaks biasing
prescription (with the proportionality constant A = 1) can yield only a rather narrow range of bias
factors for a given mass fluctuation spectrum.

This limitation arises because $R_s$ and $\nu_t$ both affect d and $\sigma_{8,\rm gal}$ in the same sense. If the
smoothing length $R_s$ is fixed, then increasing $\nu_t$ increases d (since higher peaks are rarer), and it
increases $\sigma_{8,\rm gal}$ (since higher peaks are more strongly clustered). If the threshold height $\nu_t$ is fixed,
then increasing $R_s$ increases d (since it reduces the choppiness of the density field and thus increases
the space between peaks), and it increases $\sigma_{8,\rm gal}$ for a reason that is readily appreciated in terms
of the peak-background split. Raising $R_s$ reduces the amplitude of fluctuations associated with
the 'peaks' field $\delta_p$, so the large-scale waves have more influence in raising local peaks above the
global threshold, making the global peaks more strongly biased towards regions of high background
density (see BBKS equation 6.35). With the strict peaks biasing scheme, raising $\nu_t$ to get a higher
bias factor also raises d. To keep d fixed (as we require for our comparison to the VGH data), $R_s$
must be correspondingly lowered, but this also lowers b. These two effects on b nearly cancel in our
models, so over the plausible range of peaks parameters the bias factor in a given model is nearly
constant.

For our $\Gamma = 0.5$ and $\Gamma = 0.25$ models, the strict peaks biasing prescription works well for
$\sigma_{8,\rm mass} = 1/1.5$ (b = 1.5). The peak parameters $R_s = 0.57$ $h^{-1}$ Mpc and either $\nu_t = 1.8$ (for
$\Gamma = 0.5$) or $\nu_t = 1.6$ (for $\Gamma = 0.25$) yield $\sigma_{8,\rm gal} \approx 1$ and $d \approx 4.5$ $h^{-1}$ Mpc, as desired. However,
when $\sigma_{8,\rm mass} = 1/2$ or 1/3, we cannot find peak parameters that produce a strong enough bias,
i.e., if we require d = 4.5 $h^{-1}$ Mpc, then no choice of peak parameters gives $\sigma_{8,\rm gal} = 1$. We therefore
adopt a relaxed version of the peaks prescription that can produce higher bias factors. To avoid
the cancellation effect described above, we fix $R_s$ at 0.57 $h^{-1}$ Mpc, and we vary $\nu_t$ to obtain the
desired bias factor (i.e., to obtain $\sigma_{8,\rm gal} = 1$). We then vary the proportionality constant A in order
to achieve the desired characteristic separation, d = 4.5 $h^{-1}$ Mpc. For the b = 2 and b = 3 models
we require A > 1, making the abundance of selected galaxy particles larger than the abundance
of peaks. Although our biasing formula is based on the peaks approach, in these cases there is
no direct correspondence between the selected particles and peaks. This is somewhat unsatisfying,
but since the peaks model (and any other biasing model) is at best an approximation of physical
processes whose details are not well understood, it seems sensible to loosen the model a bit when
necessary. For b = 2 the values of d that we would obtain with A = 1 are $\sim$ 6-7 $h^{-1}$ Mpc, so
for the two brightest VGH samples, which have d = 7.4 and 10.9 $h^{-1}$ Mpc respectively, we can still
identify the simulations' galaxy particles with a subset of the high peaks.

3. Results 



3.1 Visual Appearance of the Models 

Figure 2 displays representative Cartesian slices from the final-time particle distributions of
our models. Each panel displays a 15 $h^{-1}$ Mpc thick slice through the simulation cube, with the
galaxy population sampled to the characteristic separation d = 5.6 $h^{-1}$ Mpc of VGH's $M_{\rm lim} = -19$
samples. Axis scales are marked in $h^{-1}$ Mpc. Figure 2a displays slices from unbiased and
density-biased power-law models (simulation cubes 192 $h^{-1}$ Mpc on a side). Its three rows show n = 0,
-1, and -2 models from top to bottom, and its four columns show b = 1, 1.5, 2, and 3 models
from left to right. Figures 2b and 2c display galaxies from the '$\Gamma$' models (simulation cubes 300
$h^{-1}$ Mpc on a side). The columns have the same b associations as Figure 2a, but Figures 2b and
2c display peaks-biased and density-biased models, respectively. The three rows of these figures
are associated with ($\Gamma$, $\Omega_0$, $\Lambda_0$) = (0.5, 1, 0), (0.25, 1, 0), and (0.25, 0.3, 0.7) models, from top
to bottom. Throughout Figure 2 the largest rounded voids are much smaller than the panels,
indicating that our simulation cubes are indeed large enough to allow a fair measure of voids for
the range of models that we consider.

The initial conditions of all the models in Figure 2a were generated using the same random
phases, as were all the models in Figures 2b and 2c. Thus within Figure 2a and throughout Figures
2b and 2c, recognizable structures tend to form at similar locations in different slices. Each model
nonetheless produces distinctive structure, and one can notice several trends. Comparing panels
within a given row of Figure 2, as one moves from left to right the low density regions become
emptier, and the high density regions become more diffuse. Thus biasing has two basic effects on
spatial structure: it turns low density regions into completely empty voids, and it reduces the
non-linearity of high density regions. The first effect arises because biasing suppresses galaxy formation
in low density regions. The second effect arises because the mass fluctuations in our biased models
are smaller, and they are therefore less efficient at inducing gravitational collapse. Recall that all
of the models pictured in Figure 2 have the same rms fluctuation in galaxy counts at 8 $h^{-1}$ Mpc.

Figure 2a is a 0.5 Mbyte postscript file.
Figures 2b and 2c are 1.2 Mbyte postscript files.
You may obtain these files by request from
dhw@guinness.ias.edu

Figure 2: Slices through the simulated galaxy distributions at z = 0. In every panel the rms galaxy fluctuation
in spheres of radius 8 $h^{-1}$ Mpc is unity, the characteristic inter-galaxy separation is d = 5.6 $h^{-1}$ Mpc, and the
thickness of the Cartesian slice is 15 $h^{-1}$ Mpc. For all three plots, the four columns from left to right show models
with bias factors b = 1, 1.5, 2, and 3. (a) Unbiased and density-biased power-law models, with n = 0, -1, and -2
in the rows from top to bottom. (b) Unbiased and peaks-biased models, with ($\Gamma$, $\Omega_0$, $\Lambda_0$) = (0.5, 1, 0), (0.25, 1, 0),
and (0.25, 0.3, 0.7) from top to bottom. (c) Same as (b), but with peaks-biased slices replaced by corresponding
density-biased slices.

Comparing corresponding b > 1.5 panels from Figures 2b and 2c, it is clear that the empty
voids in density-biased models are larger than their peaks-biased counterparts. [Correspondingly,
we will find that the density-biased VPFs in our simulations are always higher than their
peaks-biased counterparts.] This is to be expected, since there is no more efficient way of creating voids
than by 'density biasing', i.e., by completely eliminating all particles from low density regions of
the final conditions. Peaks biasing, on the other hand, permits occasional galaxy formation in low
density regions. The rounded, 'bubble-like' voids in many of the biased models are reminiscent of
features in wedge diagrams from the CfA2 redshift survey (de Lapparent et al. 1986; 1991) or the
Perseus-Pisces redshift survey (Haynes & Giovanelli 1986; 1988). However, one should be cautious
in drawing conclusions from visual comparisons, because properties of real-space, Cartesian slices
like those in Figure 2 can be quite different from those of the redshift-space, magnitude-limited,
declination wedges that are often used to display redshift survey results.

Our spectra are normalized to have the same amplitude at 8 $h^{-1}$ Mpc. In the power-law
models of Figure 2a, the models with steeper spectra (lower n) have more power on larger scales
and less power on smaller scales. Both of these properties manifest themselves clearly in the slices.
Models with steeper spectra develop coherent features that can be traced over a larger fraction of
the simulation cube, and models with shallower spectra develop final structure that is clumpier
on small scales. The '$\Gamma$' models of Figures 2b and 2c exhibit less variation in spatial structure
than the power-law models because the differences in the initial power spectra are themselves much
smaller. However, comparing the $\Omega_0 = 1$ models in Figures 2b and 2c, $\Gamma = 0.5$ in the top rows and
$\Gamma = 0.25$ in the middle rows, one sees that void sizes increase slightly among biased models as $\Gamma$ is
lowered. It is also possible to detect smoother, more coherent filaments in the models with more
large-scale power (lower $\Gamma$). Comparing the middle and bottom rows in the same figures shows the
effect of lowering $\Omega_0$ while keeping the initial power spectrum fixed. Lowering $\Omega_0$ reduces peculiar
velocities on all scales because there is less mass available to attract condensations away from pure
Hubble flow. However, $\Omega_0$ has little effect on final spatial structure if fluctuations are normalized
to the final epoch; at the level of the Zel'dovich approximation or the adhesion approximation
it has no effect at all (see Weinberg & Gunn 1990). Some systematic differences between high-$\Omega_0$
and low-$\Omega_0$ models appear in the fully non-linear regime (see WC), but at the limited resolution
of these slice plots, the differences are essentially undetectable. In terms of visual appearance, the
effects of biasing easily outweigh the effects of $\Omega_0$ or of the initial power spectrum.

3.2 Measuring the VPF 

Our principal measure of the size and frequency of voids is the void probability function, the
probability $P_0(R)$ that a randomly placed sphere of radius R contains no galaxies. We also employ a
related statistic, the probability $P_{80}(R)$ that the average density in a randomly placed sphere is more
than 80% below the global mean density. We also call $P_{80}(R)$ the underdense region probability
function, or UPF. White (1979) discussed the relation between count probability statistics, including
the VPF, and correlation function statistics. Following White's original suggestion, many later
studies of the VPF adopted the 'scaled' variables $\chi = -\ln[P_0(R)]/\bar{N}$ and $\bar{N}\bar{\xi}$, where $\bar{N}$ and $\bar{\xi}$ are
respectively the mean number of galaxies and the mean value of the two-point correlation function in
spheres of radius R (e.g., Fry 1986; Maurogordato & Lachieze-Rey 1987; Fry et al. 1989;
Lachieze-Rey, da Costa & Maurogordato 1992). This choice of variables allows one to test the so-called
hierarchical hypothesis, the assumption that all higher-order correlation functions can be expressed
as sums of pairwise products of two-point correlation functions. However, the introduction of $\bar{\xi}$ into
the scaled radius variable obscures the relation between the scaled VPF and the actual
sizes of voids. Since we are interested in the frequency of large voids rather than the hierarchical
hypothesis, we have decided to stick with the simpler representation $P_0(R)$.

VGH present VPFs for volume-limited subsets of the CfA redshift survey with the absolute
magnitude limits $M_{\rm lim} = -18.5$, $-19.0$, $-19.5$, and $-20.0$ (for h = 1). The characteristic
separations corresponding to these four limits, derived from the galaxy luminosity function, are d = 4.5,
5.6, 7.4, and 10.9 $h^{-1}$ Mpc, respectively. The procedures described in §2 create simulated galaxy
samples with a final-time separation of d = 4.5 $h^{-1}$ Mpc. To compare our predictions with all the
data of VGH, we subsequently sample this population to separations of d = 5.6, 7.4, and 10.9
$h^{-1}$ Mpc. As mentioned in §2, because we randomly sample to compare to brighter subsets of the VGH
data, our models implicitly assume that biasing (if any) is independent of galaxy luminosity for
M < -18.5.
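As a worked illustration of this dilution step (a sketch, using only the relation $d = \bar{n}^{-1/3}$ given earlier), random sampling from a separation $d_0$ to a larger separation d retains a fraction $(d_0/d)^3$ of the galaxies. Starting from $d_0 = 4.5$ $h^{-1}$ Mpc:

```latex
\begin{align}
f(d) &= \frac{\bar{n}(d)}{\bar{n}(d_0)} = \left(\frac{d_0}{d}\right)^{3}, \\
f(5.6) &= (4.5/5.6)^3 \approx 0.52, \quad
f(7.4) = (4.5/7.4)^3 \approx 0.22, \quad
f(10.9) = (4.5/10.9)^3 \approx 0.070 .
\end{align}
```

The brightest sample therefore uses only about 7% of the simulated galaxies, which is one reason sampling noise grows for the sparser samples.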

For our power-law models, we present VPF results only in real space, ignoring possible
distortions from peculiar velocities. Measuring the VPF in real space is simple because our simulation
volume is triply periodic. We choose 2000 random points throughout the simulation volume, and
about each point we count the number of galaxies in concentric spheres of radius 1, 2, 3, ... $R_{\rm max}$
$h^{-1}$ Mpc, including particles across a periodic boundary if necessary. We adopt $R_{\rm max} = 35$ $h^{-1}$ Mpc,
a radius by which $P_{80}(R)$ drops to zero in nearly all of our models. We set $P_0(R)$ equal to the
number of empty spheres of radius R divided by the total number (2000) of spheres of radius R
placed in the simulation volume. Similarly, we set $P_{80}(R)$ equal to the number of spheres that
are at least 80% underdense, i.e., that contain $N < 0.2\,(4\pi R^3/3d^3)$ galaxies, divided by the total
number of spheres of radius R. We have four independent simulations of each power-law model,
and the VPFs that we plot are the average over these four runs.
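The counting procedure above can be sketched as follows. This is an illustrative brute-force version, not the authors' code: the function name `vpf_periodic` and its arguments are hypothetical, and the minimum-image wrap used here is valid only for radii up to half the box size.

```python
import math
import random


def vpf_periodic(galaxies, box, radii, n_spheres, d, rng):
    """Return (P0, P80) as lists over `radii` for a triply periodic cube.

    galaxies: list of (x, y, z) positions in [0, box).
    d:        characteristic separation, so the mean density is d**-3.
    """
    empty = [0] * len(radii)
    under = [0] * len(radii)
    for _ in range(n_spheres):
        centre = [rng.uniform(0, box) for _ in range(3)]
        # Squared minimum-image distance from the sphere centre to each galaxy.
        dist2 = []
        for g in galaxies:
            s = 0.0
            for a, b in zip(g, centre):
                dx = abs(a - b)
                dx = min(dx, box - dx)  # wrap across the periodic boundary
                s += dx * dx
            dist2.append(s)
        for i, R in enumerate(radii):
            n_in = sum(1 for s in dist2 if s <= R * R)
            if n_in == 0:
                empty[i] += 1
            # 80% underdense: fewer than 0.2 times the mean count in the sphere.
            if n_in < 0.2 * 4.0 * math.pi * R**3 / (3.0 * d**3):
                under[i] += 1
    return [e / n_spheres for e in empty], [u / n_spheres for u in under]
```

For the parameters quoted in the text one would call this with `n_spheres=2000` and `radii=range(1, 36)`; a spatial index such as a grid or tree would be needed to make that practical for a full simulation.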

For our '$\Gamma$' models we evaluate the VPF and UPF in real space and in redshift space, i.e., using
the spherical coordinates $(r, \theta, \phi)$ and $(r + v_r/H_0, \theta, \phi)$, where $v_r$ is the radial peculiar velocity with
respect to the observer, and $H_0$ is the Hubble constant. Adopting a $v_r$-dependent coordinate breaks
the assumption of periodic boundary conditions, so our procedure for evaluating the VPF is more
complicated than before. We want to measure the VPF in spherical rather than cubical samples
because their redshift-space distortions are more likely to mimic those of a real observational sample:
cubes have corners where the sample is unusually deep in the radial direction. We also want
our samples to fill the simulation cube, so that we take full statistical advantage of the simulation
volume. In each 300 $h^{-1}$ Mpc simulation cube, therefore, we select 8 'observers' located at the
vertices of a smaller cube of side L/2 = 150 $h^{-1}$ Mpc. For each observer in turn, we shift particles
(using the periodic boundaries) so that the observer lies at the center of the cube. We then move
galaxies to redshift space and select a spherical sample containing all galaxies within a redshift
distance $R_{\rm sample} = 130$ $h^{-1}$ Mpc of the observer. We choose 500 random points within a sphere of
radius ($R_{\rm sample} - R_{\rm max}$), and about each point we count galaxies in concentric spheres of radii 1, 2,
3, ... $R_{\rm max}$ $h^{-1}$ Mpc. We define $P_0(R)$ and $P_{80}(R)$ as before, the number of empty or underdense
spheres divided by the total number (500). We measure the real-space VPF and UPF in the same
fashion, except that we omit the shift from real space to redshift space. All of a simulation's cubical
volume appears in at least one of the cube's 8 samples, and most of it appears in more than one.
There are two independent simulations of each '$\Gamma$' model, and the results that appear in our figures
are the average over all 16 (= 2 × 8) relevant samples.
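The observer placement and the real-to-redshift-space mapping can be sketched as below. This is a hedged illustration: the function names are hypothetical, the placement of the smaller cube at one quarter and three quarters of the box along each axis is an assumption (any periodic translation is equivalent), and velocities are taken in km/s with $H_0 = 100$ km/s per $h^{-1}$ Mpc so that $v_r/H_0$ is in $h^{-1}$ Mpc.

```python
import math


def observer_positions(box):
    """Eight observers at the vertices of a cube of side box/2."""
    q = (0.25 * box, 0.75 * box)  # assumed placement of the smaller cube
    return [(x, y, z) for x in q for y in q for z in q]


def to_redshift_space(positions, velocities, observer, box, H0=100.0):
    """Return observer-centred redshift-space coordinates.

    Each galaxy is first shifted periodically so the observer sits at the
    cube centre, then moved along the line of sight to r + v_r / H0.
    """
    out = []
    for p, v in zip(positions, velocities):
        # Coordinates relative to the observer, wrapped into [-box/2, box/2).
        rel = [((a - o + box / 2) % box) - box / 2 for a, o in zip(p, observer)]
        r = math.sqrt(sum(x * x for x in rel))
        if r == 0.0:
            out.append(rel)
            continue
        v_r = sum(x * u for x, u in zip(rel, v)) / r  # radial peculiar velocity
        s = r + v_r / H0                              # redshift distance
        out.append([x * s / r for x in rel])
    return out


def spherical_sample(points, r_sample):
    """Keep galaxies within redshift distance r_sample of the observer."""
    return [p for p in points if math.sqrt(sum(x * x for x in p)) <= r_sample]
```

For a 300 $h^{-1}$ Mpc cube, `spherical_sample(to_redshift_space(pos, vel, obs, 300.0), 130.0)` applied to each of the eight observers would give the samples described above; sphere centres for the VPF would then be drawn within radius $R_{\rm sample} - R_{\rm max}$ of the origin.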

Figure 3 compares the real-space (solid line) and redshift-space (dotted line) VPFs of our
unbiased, ($\Gamma$, $\Omega_0$, $\Lambda_0$) = (0.25, 1, 0) model. [Following WC, we use d = 5.6 $h^{-1}$ Mpc here and in any
other figures that display the VPF for only one value of d.] At a given radius, the void probability
tends to be slightly higher in redshift space than in real space, probably because coherent outflows
from low density regions move galaxies outward from the positions that they occupy in physical
space, thereby increasing the sizes of voids along the line-of-sight (cf. figure 6 of Regos & Geller
1991). Although redshift-space VPFs in our simulations are almost always higher than their
real-space counterparts, there are a few rare instances where they are lower, especially at large radii.
Lower redshift-space VPFs can arise because velocity dispersions in groups and clusters scatter
galaxies outward into regions that are empty in physical space. The differences between real-space
and redshift-space VPFs are generally smaller than those in Figure 3 for models with smaller $\Omega_0$,
larger $\Gamma$, larger b, or smaller d. The trends with $\Omega_0$, $\Gamma$, and b are comprehensible, in that decreasing
the mass density or the scale or amplitude of mass fluctuations leads directly to a decrease in
large-scale peculiar velocities. The trend with d probably occurs because the coherent outflows
associated with voids are less evident at smaller separations.

Figure 3: The void probability function (VPF) in real space and redshift space. $P_0(R)$ is the probability that
a randomly placed sphere of radius R (in $h^{-1}$ Mpc) is empty of galaxies. Solid and dotted curves mark the
real-space and redshift-space VPFs of our unbiased, ($\Gamma$, $\Omega_0$, $\Lambda_0$) = (0.25, 1, 0) model, with d = 5.6 $h^{-1}$ Mpc. VPFs
tend to be slightly higher in redshift space than in real space, probably because coherent outflows from low density
regions shift galaxies outward from the positions that they occupy in physical space, stretching voids along the
line-of-sight. The dashed curve shows the mean VPF for a Poisson distribution of particles with the same value of d,
$P_0(R) = \exp(-4\pi R^3/3d^3)$. VPFs of the N-body models lie far above the Poisson VPF at large radii, as expected.

The dashed curve in Figure 3 shows the VPF expected for a Poisson distribution of points
with characteristic separation d, $P_0(R) = \exp(-4\pi R^3/3d^3)$. The model VPFs lie far above the
Poisson VPF at large radii, as expected. Nonetheless, in Figure 3 and in the VPFs of all of our
other models, $P_0$ drops to $10^{-4}$ at radii much smaller than L/2 = 150 $h^{-1}$ Mpc, confirming Figure
2's indication that even our largest voids are much smaller than the simulation cube.
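To give a sense of scale for the Poisson curve, here is a short worked evaluation. The only input is the Poisson expression itself; the choice of R = 10 $h^{-1}$ Mpc is an arbitrary illustrative radius, not a value singled out in the text.

```latex
\begin{align}
\bar{N}(R) &= \frac{4\pi}{3}\left(\frac{R}{d}\right)^{3}, \qquad
P_0(R) = e^{-\bar{N}(R)}, \\
\bar{N}(10) &= \frac{4\pi}{3}\left(\frac{10}{5.6}\right)^{3} \approx 23.9, \qquad
P_0(10) \approx e^{-23.9} \approx 4 \times 10^{-11} \quad (d = 5.6\ h^{-1}\,\mathrm{Mpc}) .
\end{align}
```

A Poisson distribution with this separation would therefore essentially never produce an empty sphere of radius 10 $h^{-1}$ Mpc among a few thousand trial spheres, so any measurable void probability at such radii reflects clustering.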

3.3 Dependence of the VPF on Model Parameters 

3.3.1 Unbiased Models 

Figure 4 illustrates the dependence of the VPF on the shape of the initial power spectrum for
unbiased models. Solid, dotted, and dashed lines represent the real-space VPFs for our unbiased
power-law models with n = 0, -1, and -2, respectively. The three curves are quite similar, showing
that the VPF in unbiased models depends only weakly on the shape of the initial power spectrum.
This insensitivity probably indicates that large empty regions grow from fluctuations near the 8
$h^{-1}$ Mpc normalization scale, where all three models have the same rms amplitude. The VPFs in
this figure nonetheless exhibit a systematic trend: as n is lowered, the VPF decreases slightly. The
same trend is visible in the corresponding slice plots (column one of Figure 2a). WC also noted
this effect among Gaussian models, and they cited two reasons for it. First, introducing small-scale
power (increasing n) creates stronger negative fluctuations on small scales, and these lead directly
to higher VPFs on those scales. Second, introducing small-scale power clumps galaxies on small
scales, reducing the effective number of independent tracers; since the effective density of tracers
is thus lowered, the VPF goes up. Models with lower n have stronger fluctuations on large scales,
but on these scales the fluctuation amplitude may be too low to clear out empty regions.

Figure 4: Dependence of the VPF on the initial power spectrum, for unbiased models. Solid, dotted, and dashed
lines show the real-space VPFs of unbiased power-law models with d = 5.6 $h^{-1}$ Mpc and n = 0, -1, and -2,
respectively. Increasing n leads to a modest increase in the VPF, but the trend is rather weak.

Figure 5 illustrates the dependence of the VPF (or lack thereof) on $\Omega_0$ and $\Lambda_0$, for unbiased
models. Five curves are plotted: solid, dotted, short-dashed, long-dashed, and dot-dashed lines for
the real-space VPFs of unbiased, $\Gamma = 0.25$ models with the parameter combinations ($\Omega_0$, $\Lambda_0$) = (1,
0), (0.3, 0), (0.1, 0), (0.3, 0.7), and (0.1, 0.9), respectively. Four of the curves lie so close to each
other that they are indistinguishable on this plot. Only the ($\Omega_0$, $\Lambda_0$) = (0.1, 0) VPF stands slightly
apart, and even in this case the differences in the tail of the VPF correspond to just one or two
voids out of the 8000 randomly placed spheres. Figure 5 demonstrates that the real-space VPF is
extremely insensitive to $\Omega_0$ and $\Lambda_0$, if the power spectrum is held fixed. This insensitivity is not
terribly surprising, since the final spatial structure is completely independent of $\Omega_0$ and $\Lambda_0$ at the
level of the adhesion approximation, and one expects dynamical non-linearities to distinguish these
models mainly in collapsed, virialized regions. In redshift space we find a mild dependence of the
VPF on $\Omega_0$ because the magnitude of peculiar velocity distortions depends on $\Omega_0$. However, we
have already seen that the difference between real-space and redshift-space VPFs is small (Figure
3), and the differences between redshift-space VPFs with the same power spectrum but different
values of $\Omega_0$ are smaller still. At fixed $\Omega_0$, even the redshift-space VPF is insensitive to $\Lambda_0$ because
peculiar velocities depend almost entirely on the density parameter rather than the cosmological
constant.

Figure 5: Dependence of the VPF on $\Omega_0$ (the cosmic density parameter) and $\Lambda_0$ (the cosmological constant).
Different line types show real-space, d = 5.6 $h^{-1}$ Mpc VPFs of five unbiased, $\Gamma = 0.25$ models with five different
combinations of $\Omega_0$ and $\Lambda_0$, as labeled. Four of these VPFs are so similar that the curves are indistinguishable on
this plot, and the fifth ($\Omega_0 = 0.1$, $\Lambda_0 = 0$) is only marginally different. Within the parameter range shown here, $\Omega_0$
and $\Lambda_0$ have essentially no impact on the real-space VPF. The value of $\Omega_0$ has a small effect on the redshift-space
VPF, since it determines the amplitude of peculiar velocities.



3.3.2 Biased Models 

Figure 6 illustrates the dependence of the VPF on the prescription adopted for biased galaxy
formation, at a fixed bias factor b = 1.5 and a characteristic separation d = 5.6 $h^{-1}$ Mpc. Solid,
dashed, and dotted lines display real-space VPFs for standard CDM models ($\Gamma = 0.5$, $\Omega_0 = 1$),
biased to b = 1.5 with peaks, density, and C/O biasing, respectively. Density biasing produces much
higher VPFs than peaks biasing, as one would expect given the visual appearances of Figures 2b
and 2c. This difference between the two biasing schemes holds for all of our models; it is even more
pronounced for $\Gamma = 0.25$ (more large-scale power). Peaks biasing and C/O biasing, on the other
hand, produce nearly identical VPFs. Although the difference between these schemes is somewhat
larger for other values of d, it is always very small.

Figure 7 illustrates the dependence of the VPF on the bias factor b, for density biasing (Figure
7a) and peaks biasing (Figure 7b). Figure 7a is designed for comparison with figure 4b of EEGS,
which also shows real-space VPFs for spatially flat, low-Ω₀ CDM models with different bias factors.
Except for our choice of smoothing filter and our dilution to a fixed galaxy density, our density
biasing scheme is the same as EEGS's biasing scheme. In Figure 7a, solid, dotted, short-dashed,
and long-dashed lines represent real-space VPFs of our (Γ, Ω₀, Λ₀) = (0.25, 0.3, 0.7) models, with
bias factors of b = 1, 1.5, 2, and 3, respectively. In both EEGS's figure 4b and our Figure
7a, increasing the bias factor raises the VPF. However, the form of this trend differs dramatically
in the two cases. In EEGS's figure, four successive increases in the bias factor produce a steady
and uniform march of the VPF to appreciably higher values. EEGS's median void radii display a
similar steady trend (see their table 2 and figure 6). It is exactly this systematic dependence which
suggested that the VPF might be a sensitive indicator of biasing, and perhaps a statistical tool
with which to determine the bias factor. In our Figure 7a, however, the VPF increases only weakly
over the entire range 1.5 ≤ b ≤ 3, despite a large increase between b = 1 and b = 1.5. This result
is consistent with the visual appearance of successive columns in Figure 2c.

Figure 6. Dependence of the VPF on the prescription for biased galaxy formation, at a fixed value of the bias
factor b. Solid, dashed, and dotted lines show real-space, d = 5.6 h⁻¹ Mpc VPFs for b = 1.5, 'standard CDM' models
with peaks biasing, density biasing, and C/O biasing, respectively. Density biasing creates larger voids and a higher
VPF than peaks biasing; the same trend holds for other models, other bias factors, and other values of d. Peaks
biasing and C/O biasing yield similar VPFs.

Figure 7. Dependence of the VPF on the bias factor b. (a) Real-space, d = 5.6 h⁻¹ Mpc VPFs for unbiased
and density-biased models with Γ = 0.25, Ω₀ = 0.3, and Λ₀ = 0.7. The VPF jumps sharply between b = 1 and
b = 1.5, but it increases only slowly over the entire remaining range, from b = 1.5 to b = 3. This behaviour contrasts
with that in EEGS's figure 4b, to which this plot can be compared. (b) Same as (a), but with density-biased VPFs
replaced by corresponding peaks-biased VPFs. The latter are more like the corresponding unbiased VPF, though
again there is a larger jump between b = 1 and b = 1.5 than between higher bias factors.

Figure 8. Dependence of the VPF on the initial power spectrum, for biased models. Solid, dotted, and dashed
lines show the real-space VPFs of density-biased power-law models with d = 5.6 h⁻¹ Mpc and n = 0, -1, and -2,
respectively. The first two VPFs are similar, but the biased n = -2 VPF is much higher, as biasing turns the
n = -2 model's large underdense regions into empty voids. The trend for biased models is quite different from that
for unbiased models (cf. Figure 4).

There are many differences between our simulations and those of EEGS, but the difference in
normalization procedure is probably the most important for explaining the different dependence
on the bias factor. For their CDM models, EEGS take a fixed underlying mass distribution and
choose successively more strongly biased subsets of the particles. Each increase in the bias factor
yields a 'galaxy' population that is more strongly clustered in an rms sense, and it is not surprising
that void sizes increase each time. However, at most one of these biased subsets can match the
known rms fluctuation of galaxy counts as measured from the two-point correlation function. We
adopt the observed rms fluctuation at 8 h⁻¹ Mpc as a constraint on our 'galaxy' populations, so we
accompany each increase in the bias factor with a corresponding decrease in the amplitude of the
underlying mass fluctuations. The reduction in mass fluctuations counteracts the stronger biasing;
in an rms sense the two effects cancel, by construction. Void sizes are sensitive to the efficiency of
galaxy formation in low density regions, so some dependence of the VPF on b remains, but this
dependence is strong only between b = 1 and b = 1.5.
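
The normalization argument can be illustrated with a few lines of Python. This is only a sketch: it
assumes the definition of the bias factor used in this paper (b is the ratio of rms galaxy fluctuations
to rms mass fluctuations at 8 h⁻¹ Mpc), and the galaxy rms of 1.0 and the function name
`mass_sigma8` are illustrative assumptions rather than quantities taken from our simulation code.

```python
def mass_sigma8(b, sigma8_gal=1.0):
    """Return the rms mass fluctuation at 8 h^-1 Mpc that keeps the galaxy rms fixed.

    Assumes b = sigma8_gal / sigma8_mass. The default sigma8_gal = 1.0 is an
    illustrative value, not a number read from the simulations.
    """
    if b <= 0:
        raise ValueError("bias factor must be positive")
    return sigma8_gal / b


if __name__ == "__main__":
    for b in (1.0, 1.5, 2.0, 3.0):
        # Stronger bias is paired with a lower mass amplitude, so the
        # product b * sigma8_mass (the galaxy rms) never changes.
        print(f"b = {b:3.1f}  sigma8_mass = {mass_sigma8(b):.3f}")
```

Under this convention every model in a sequence of increasing b has the same rms galaxy fluctuation,
which is why only the residual sensitivity to galaxy formation in low density regions can change the VPF.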

Figure 7b illustrates the dependence of the VPF on the bias factor for the peaks biasing
prescription. It is obtained by replacing the density-biased VPFs of Figure 7a with their peaks-biased
counterparts. Once again the VPF is more sensitive to the difference between b = 1 (unbiased) and
b = 1.5 than to the differences between b = 1.5, 2, and 3. However, the jump from b = 1 to b = 1.5
is much smaller than it is for density biasing, and the growth of the VPF with b is somewhat more
steady.

Figure 8 illustrates the dependence of the VPF on the shape of the initial power spectrum
for biased models (analogous to Figure 4 for unbiased models). Solid, dotted, and dashed lines
display the real-space VPFs of our b = 2, density-biased, power-law models with n = 0, -1, and
-2, respectively. The n = 0 and n = -1 models have similar VPFs, but the n = -2 VPF is much
higher. This behaviour contrasts with that in Figure 4, where all three models have similar VPFs
and the n = -2 VPF is the lowest. WC found similar results for their biased models. While the
large-scale fluctuations in the n = -2 models are too weak to clear out empty voids by gravity
alone, they do create large regions of low mass density, and biasing turns these regions into large,
empty voids. We find the same trend (larger voids in biased models with more large-scale power)
when comparing our Γ = 0.5 and Γ = 0.25 models, using either density biasing or peaks biasing.
If one knew that galaxy formation were biased and one knew the specific nature of the bias, one
could conceivably derive constraints on the primordial power spectrum from the VPF. However,
given the uncertainties in the appropriate choice of biasing scheme, the VPF is probably not a
useful diagnostic for P(k). Voids tell us more about the relation between galaxies and mass than
about the power spectrum of the mass distribution itself.

3.4 Basic Comparison to the Data 

The Vogeley, Geller & Huchra (VGH) void probability data come from three magnitude-limited
redshift surveys, 'CfA1 North', 'CfA2 North', and 'CfA2 South'. The 'CfA1 North' survey consists
of the northern part of the original CfA survey (Huchra et al. 1983), which is complete to a limiting
apparent magnitude of 14.5, covers the range δ > 0°, b^II > 40°, and contains 1833 galaxies. The
'CfA2 North' survey comes from the extension of the CfA survey to the limiting magnitude 15.5
(see Geller & Huchra 1989). It covers the range 8h < α < 17h, 26.5° < δ < 44.5°, and it contains
2556 galaxies. The 'CfA2 South' survey, also from the CfA extension, contains 2414 galaxies in
the range 20h < α < 4h, 6° < δ < 36°. From each of these three surveys, VGH construct four
volume-limited samples, the absolute magnitude of the faintest galaxies in these samples being
M_lim = -18.5, -19.0, -19.5, and -20 (for h = 1). The corresponding characteristic inter-galaxy
separations, computed from the galaxy luminosity function, are d = 4.5, 5.6, 7.4, and 10.9
h⁻¹ Mpc, respectively (see table 1 of VGH). These 12 volume-limited samples contain between 182 and
627 galaxies each, and they range in depth from 40 to 126 h⁻¹ Mpc.

VGH computed VPFs for each of these 12 observational samples, and M. Vogeley has kindly
provided these data to us in the form of computer files. Figure 9 compares the observational
data to the predictions of our 'Γ' models. Open circles, asterisks, and open triangles display the
VPFs measured from 'CfA2 North', 'CfA2 South', and 'CfA1 North', respectively. The four rows
display VPFs for d = 4.5, 5.6, 7.4, and 10.9 h⁻¹ Mpc samples, from top to bottom. The same 12
observational VPFs appear in each column of panels in Figure 9. The trend toward larger VPFs
from top to bottom within a given column reflects the simple fact that increasing the characteristic
inter-galaxy separation (decreasing the galaxy density) increases the sizes of empty voids. We have
not attached error bars to the points of Figure 9; we will consider the effects of finite volume errors
in the next section. For now we note simply that the three, largely independent observational
samples give fairly consistent but not identical results.
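
The dependence of the VPF on d can be seen most simply in the unclustered (Poisson) reference
case, for which the probability that a sphere of radius R is empty is exp(-nV), with V the sphere
volume. The sketch below assumes that the characteristic separation is related to the mean galaxy
density by n = d⁻³; the function name `poisson_vpf` is hypothetical, and the Poisson curve is only a
baseline for orientation, not one of our models.

```python
import math


def poisson_vpf(radius, d):
    """VPF of an unclustered point process with mean density n = d**-3 (assumed relation).

    radius and d must be in the same units (e.g. h^-1 Mpc).
    """
    if d <= 0 or radius < 0:
        raise ValueError("need d > 0 and radius >= 0")
    number_density = d ** -3
    sphere_volume = 4.0 / 3.0 * math.pi * radius ** 3
    return math.exp(-number_density * sphere_volume)


if __name__ == "__main__":
    for d in (4.5, 5.6, 7.4, 10.9):
        # Sparser samples (larger d) give a higher void probability at fixed radius.
        print(f"d = {d:5.1f}  P0(R = 5) = {poisson_vpf(5.0, d):.4f}")
```

Clustered distributions have higher VPFs than this baseline at the same d, but the trend with d has the
same sign.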

Figures 9a, 9b, and 9c compare the VGH results to the redshift-space VPFs of our (Γ, Ω₀, Λ₀)
= (0.5, 1, 0), (0.25, 1, 0), and (0.25, 0.3, 0.7) models, respectively. We do not include a comparison
to the power-law models because they have less observational and theoretical motivation; WC show
some comparisons between these models and the VGH data. In each panel of Figure 9, the solid,
dotted, short-dashed, and long-dashed lines are associated with b = 1, 1.5, 2, and 3, respectively.
Left hand columns show results for peaks biasing, right hand columns for density biasing (the solid
lines show unbiased models in each case). The dot-dashed lines in the right hand column of Figure
9a are associated with the C/O biasing scheme, for b = 1.5. Figure 9 exhibits many of the model
trends seen in previous figures. For example, it confirms that density biasing produces significantly
larger VPFs than peaks biasing, and that peaks biasing and C/O biasing produce nearly identical
VPFs (cf. Figure 6). Figure 9 also confirms that the VPFs of unbiased models depend only weakly
on Ω₀, Λ₀, and the initial power spectrum (cf. Figures 4 and 5), and that increasing large-scale
power can substantially increase density-biased VPFs (cf. Figure 8): as Γ is lowered, Figure
9's density-biased VPFs increase appreciably. This effect is also present, though less pronounced,
among peaks-biased models. Finally, within a given panel of Figure 9 the difference between b = 1
and b ≥ 1.5 VPFs is often much larger than the difference among b ≥ 1.5 VPFs (cf. Figure 7). This
effect is especially strong among density-biased models and models with more large-scale power,
because they are least like their unbiased counterparts. The effect is weaker for larger values of
d because increasing d raises unbiased VPFs relative to their biased counterparts. The impact
of increasing d is evidently greater when a smaller fraction of the total volume is already empty
(i.e., in unbiased models).

Figure 9. Comparison between observed VPFs and the VPFs of our best-motivated theoretical models. Within
each panel: (i) the circles, asterisks, and triangles are the observed, redshift-space VPFs calculated by Vogeley,
Geller, and Huchra (1991; VGH) from volume-limited samples of the 'CfA2 North', 'CfA2 South', and 'CfA1 North'
galaxy redshift surveys, respectively; (ii) the solid, dotted, short-dashed, and long-dashed lines are redshift-space
VPFs calculated from our theoretical models with b = 1, 1.5, 2, and 3, respectively. Within each of the three plots:
(i) the four rows from top to bottom show VPFs for characteristic inter-galaxy separations d = 4.5, 5.6, 7.4, and
10.9 h⁻¹ Mpc; (ii) the b ≥ 1.5 lines in the left and right columns are associated with peaks biasing and density
biasing, respectively. The dot-dashed lines in the right column of (a) are the redshift-space VPFs from our b = 1.5,
C/O-biased model. Figures (a), (b), and (c) show model VPFs for (Γ, Ω₀, Λ₀) = (0.5, 1, 0), (0.25, 1, 0), and
(0.25, 0.3, 0.7), respectively. These VPFs exhibit many of the model-dependent trends discussed earlier. Unbiased,
peaks-biased, and C/O-biased models generally reproduce the VGH data fairly well, while density-biased models
tend to create an excess of large voids.
Turning to a comparison with the data, it is clear that the density-biased models of Figure
9 generally tend to overproduce voids. This tendency weakens (and occasionally reverses) as d
increases (i.e., as the observed VPFs go up), or as large-scale power or Ω₀ decrease (so that model
VPFs go down). However, all of our density-biased models match the data less well than their
peaks-biased or C/O-biased counterparts. The slice plots of density-biased models (Figure 2) share
the 'bubbly' appearance of the CfA2 slices, but the model voids are considerably larger than those
in the CfA2 survey.

Most of the unbiased, peaks-biased, and C/O-biased models of Figure 9 match the VGH data
fairly well. Unbiased models typically fit best at small d, while peaks-biased or C/O-biased models
fit better at larger d, with b ≈ 1.5-2 models perhaps the most successful overall. Among unbiased
and peaks-biased models, the data do not seem to strongly prefer any one of our three (Γ, Ω₀, Λ₀)
= (0.5, 1, 0), (0.25, 1, 0), (0.25, 0.3, 0.7) parameter combinations. The high-Γ models fit better at
d = 5.6 h⁻¹ Mpc, but they underproduce large voids at d = 10.9 h⁻¹ Mpc, where the low-Γ models
are more successful. All of our biased models overestimate the VPF at d = 5.6 h⁻¹ Mpc, while all
of our unbiased models underestimate the VPF at d = 10.9 h⁻¹ Mpc, so no one model offers an
ideal fit to the data. However, we have not yet considered the effects of finite volume errors, so we
cannot draw quantitative conclusions.

3.5 Error Estimates for Specific Models 

Although the CfA2 survey is one of the deepest and most ambitious redshift surveys to date,
the total volume that it probes does not exceed the volume of the largest low density regions by an
enormous factor. Estimates of the VPF are therefore subject to significant finite volume errors:
a distant observer mapping a volume of the universe of the same size might find a different VPF
because of statistical fluctuations in the local structure. These finite volume effects dominate over
other sources of error (e.g., magnitude and redshift errors) in the observed VPF. To decide whether
a particular theoretical model is consistent with the VGH data, we must ask whether observers
mapping equal volumes of the model universe could reproduce the observed VPF a reasonable
fraction of the time.

The best way to carry out such a comparison between simulations and data is to draw from
the numerical models simulated data sets with the same geometry and selection effects as the
observational survey, then analyze the simulated and real data in an identical fashion. We do
not know precisely what procedures VGH used for defining their samples and placing their test
spheres, so here we adopt the simpler procedure of analyzing spherical samples with the same
volumes as the VGH samples. This method should still yield reasonable estimates of the finite
volume fluctuations for each sample. Within each of the two simulations of a given model, we
place 50 'observers' (i.e., origins) at random positions. Around each observer in turn, we move the
particles into redshift space and measure the VPF in eight spherical samples, with volumes chosen
to match those of VGH's eight volume-limited CfA2 samples. The radii R_sample of these spheres
range from 22.9 h⁻¹ Mpc (giving the volume of the d = 4.5 h⁻¹ Mpc, CfA2 North sample) to 54.7
h⁻¹ Mpc (giving the volume of the d = 10.9 h⁻¹ Mpc, CfA2 South sample). In each sample sphere
we randomly select a fraction (4.5 h⁻¹ Mpc / d_s)³ of the particles, where d_s is the characteristic
separation of the associated CfA2 sample, so that on average the mean galaxy density in our
samples matches that of observed galaxies at the associated absolute magnitude limit. We measure
the redshift-space VPF by counting galaxies in spheres of radius 1, 2, ..., R_max h⁻¹ Mpc, placed at
8000 random positions out to a distance (R_sample - R_max) from the observer. We adopt R_max = 12,
14, 16, and 22 h⁻¹ Mpc for the four successive values of d. These values are large enough to go
beyond the largest voids found by VGH at each d. The larger R_max, the smaller the volume we
have in which to place test spheres, so we always attempt to keep R_max significantly smaller than
R_sample.
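
The measurement for a single observer and a single sample volume can be sketched as follows. This is
a minimal illustration under stated assumptions, not our production code: galaxy positions are taken
to be already in redshift space and expressed relative to the observer, the nearest-galaxy search is a
brute-force loop, and the names `measure_vpf`, `random_point_in_sphere`, `d_sim`, and `seed` are
hypothetical.

```python
import math
import random


def random_point_in_sphere(radius, rng):
    """Draw a point uniformly from a sphere centred on the origin (rejection sampling)."""
    while True:
        p = tuple(rng.uniform(-radius, radius) for _ in range(3))
        if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2:
            return p


def measure_vpf(galaxies, r_sample, d_s, r_max, n_test=8000, d_sim=4.5, seed=0):
    """Estimate the VPF at radii 1, 2, ..., r_max inside one spherical sample.

    galaxies : iterable of (x, y, z) positions relative to the observer,
               assumed to be in redshift space already.
    d_sim    : characteristic separation of the undiluted particles (assumed 4.5).
    """
    if r_max >= r_sample:
        raise ValueError("r_max must be smaller than r_sample")
    rng = random.Random(seed)
    origin = (0.0, 0.0, 0.0)
    # Dilute to the target density: keep each particle with probability (d_sim / d_s)^3.
    keep_fraction = (d_sim / d_s) ** 3
    sample = [g for g in galaxies
              if math.dist(g, origin) <= r_sample and rng.random() < keep_fraction]
    radii = list(range(1, int(r_max) + 1))
    empty_counts = [0] * len(radii)
    for _ in range(n_test):
        # Test-sphere centres stay within (r_sample - r_max) of the observer,
        # so even the largest test sphere lies entirely inside the sample.
        centre = random_point_in_sphere(r_sample - r_max, rng)
        nearest = min((math.dist(centre, g) for g in sample), default=math.inf)
        # A sphere of radius R is empty exactly when the nearest galaxy is farther than R.
        for i, radius in enumerate(radii):
            if nearest > radius:
                empty_counts[i] += 1
    return radii, [count / n_test for count in empty_counts]
```

Because a single nearest-galaxy distance decides emptiness at every radius, the estimated VPF is
automatically non-increasing in R for each set of test spheres, mirroring the interdependence of
successive VPF points.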

Our procedure produces 100 (= 50 observers per realization x 2 realizations) measures of the
VPF for each of the eight CfA2 sample volumes, and we sort these 100 measures from the lowest
to the highest VPF at each radius R. The 5th, 25th, 75th, and 95th positions in this ranking
represent our best estimates of the 5%, 25%, 75%, and 95% VPFs associated with a given model.
At a given radius, one out of 20 observers measures a VPF value lower than the 5% VPF, and one
out of 20 measures a VPF higher than the 95% VPF. Half of the observers measure a VPF between
the 25% and 75% levels. If the observational data fall in this inter-quartile range, then the model
matches the data well. If the data fall in the 5%-95% range, then the model and data agree at
the "2σ" level. Successive points on the VPF are not independent, since a large void at one radius
must contain voids at all smaller radii. VPFs at different d are also not independent because a
void that is empty at one d must be empty at all larger d. These interdependences make it difficult
to combine multiple VPF measurements for multiple samples into overall model likelihoods. We
will not attempt to solve this statistical problem in this paper; we leave the task of devising useful
likelihood tests to a more detailed comparison study that applies matched procedures to simulations
and observations.
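
The ranking step can be written compactly. The sketch below assumes that the VPF measurements are
stored as one list per observer, all evaluated at the same radii, and that "position" means the
1-indexed place in the ascending ranking; the function name `vpf_percentile_bands` is hypothetical.

```python
def vpf_percentile_bands(vpf_measurements, ranks=(5, 25, 75, 95)):
    """Return, for each requested rank, the VPF found at that position of the
    ascending ranking, evaluated separately at every radius.

    vpf_measurements : list of per-observer VPF lists, all of equal length.
    ranks            : 1-indexed positions in the sorted list (an assumed convention).
    """
    n_observers = len(vpf_measurements)
    if any(rank < 1 or rank > n_observers for rank in ranks):
        raise ValueError("each rank must lie between 1 and the number of observers")
    n_radii = len(vpf_measurements[0])
    bands = {rank: [] for rank in ranks}
    for i in range(n_radii):
        # Sort the observers' values at this radius from lowest to highest.
        column = sorted(measurement[i] for measurement in vpf_measurements)
        for rank in ranks:
            bands[rank].append(column[rank - 1])
    return bands
```

With 100 observers, `bands[5]` and `bands[95]` trace the edges of the "2σ" corridor and `bands[25]`
and `bands[75]` trace the inter-quartile range. The ranking is done independently at each radius, so
a band need not correspond to the VPF of any single observer.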

We cannot attach "error bars" directly to the observational data in a model-independent way
because we do not know the underlying statistical distribution from which the real data are drawn.
The expected finite-volume errors are different for each theoretical model (models with larger
voids also have larger sample-to-sample fluctuations in the VPF), so we must compare the data
individually to each model in turn. Figures 10a, 10b, and 10c show such a comparison for three
cases that span the range of our 'Γ' model VPFs: the unbiased, (Γ, Ω₀, Λ₀) = (0.25, 0.3, 0.7) model,
the b = 1.5, C/O-biased, (Γ, Ω₀, Λ₀) = (0.5, 1, 0) model, and the b = 3, density-biased, (Γ, Ω₀, Λ₀)
= (0.25, 1, 0) model. The first of these is a representative unbiased model, the second is arguably
the most physically motivated of our biased models, and the third creates the largest voids of all
our 'Γ' models. Left and right hand columns of Figure 10 are associated with the sample volumes
of 'CfA2 North' and 'CfA2 South', respectively, and the four rows correspond to d = 4.5, 5.6, 7.4,
and 10.9 h⁻¹ Mpc, from top to bottom. Circles (left hand columns) and asterisks (right hand
columns) show the VGH data. The pair of solid lines in each panel shows the 5% and 95% VPFs
of the corresponding model; the region between them can be regarded as a "2σ error corridor" for
the model. The pair of dotted lines shows the 25% and 75% VPFs. Both pairs of lines diverge as
P_0 descends because of the logarithmic scale on the P_0-axis.

Beginning with Figure 10a, we see that the VGH data generally lie between the 75% and 95%
VPFs of the unbiased, (Γ, Ω₀, Λ₀) = (0.25, 0.3, 0.7) model. Although the model systematically
underproduces large voids on average, it is consistent with the data at the 2σ level. The only
exception is the large-R tail of the d = 4.5 h⁻¹ Mpc, northern VPF, which lies slightly above the
model's 95% VPF. This tail is the one region of qualitative discrepancy between the VPFs of the
CfA2 North sample and the corresponding VPFs of the other two VGH samples, so we are reluctant
to place much weight on it. Putting it aside, we conclude that the VGH data are consistent with
an unbiased Gaussian model, and that the voids in the CfA2 survey provide no convincing evidence
for biased galaxy formation.

Figure 10b shows still better agreement between model and data. Most of the VGH results
lie between the 25% and 75% VPFs of this b = 1.5, C/O-biased, standard CDM model. Again
the tail of the d = 4.5 h⁻¹ Mpc, northern VPF lies just above the model's 95% VPF. Apart from
this discrepancy, the VGH data agree well with this Gaussian model, which incorporates the sort
of modest bias predicted by cosmological simulations with gas dynamics (CO; Katz et al. 1992).
Similar results obtain for peaks-biased models.

Figure 10c shows results for our most severely biased 'Γ' model. For d < 7.4 h⁻¹ Mpc, the
VGH data lie mostly between the 5% and 25% VPFs, confirming the tendency of this model to
overproduce large voids. However, even though the mean VPF predicted by this model is a poor
match to the data (see Figure 9b), the model remains consistent with the current observations
because it predicts large VPF variations in samples of this size. This is probably a specific example
of a general problem: models with large voids also have large sample-to-sample fluctuations in the
VPF, so larger data sets are needed to rule them out.

The overall message of this section may seem somewhat discouraging. The data do not
convincingly exclude any of these three models, even though the models span a large range in predicted
VPFs. However, the situation should improve in the near future, as the CfA extension has now
been completed over a larger fraction of the sky. M. Vogeley (private communication) reports that
VGH have now doubled the size of their observational sample, and that the VPF results for this
larger sample are similar to those for the original data set. With the doubled sample, it may well
be possible to rule out severely biased models, and perhaps even unbiased models. In the longer
run, massive redshift surveys like the Sloan Digital Sky Survey (Gunn & Knapp 1993) will probe
much greater volumes and yield highly accurate estimates of the VPF, placing correspondingly
tight constraints on theories of biased galaxy formation.

Figure 10. Detailed comparison between three of our models and the VGH data, including the effects of finite
volume fluctuations. Circles (left panels) and asterisks (right panels) show the VGH data from 'CfA2 North' and
'CfA2 South', respectively, with d = 4.5, 5.6, 7.4, and 10.9 h⁻¹ Mpc from top to bottom. For each of these eight
samples, we measure the VPF from 100 simulated data sets drawn from the theoretical model, each with the same
volume as the corresponding observational sample. Solid lines in each panel mark the 5th-lowest and 5th-highest
VPFs of the 100 model data sets; the region between them can be regarded as a "2σ error corridor" for the model.
Dotted lines mark the 25th-lowest and 25th-highest VPFs, and thus represent the model's inter-quartile range. (a)
The unbiased, (Γ, Ω₀, Λ₀) = (0.25, 0.3, 0.7) model. The data are consistent with the model at the "2σ" level, except
for the large-R tail in the northern, d = 4.5 h⁻¹ Mpc sample. (b) The b = 1.5, C/O-biased, (Γ, Ω₀, Λ₀) = (0.5, 1,
0) model. This model provides an excellent fit to the data overall, though again there is a small area of discrepancy
in the tail of the northern, d = 4.5 h⁻¹ Mpc sample. (c) The b = 3, density-biased, (Γ, Ω₀, Λ₀) = (0.25, 1, 0)
model. Although the model tends to overproduce voids at the smaller values of d, it predicts large sample-to-sample
variations in the VPF (wide error corridors), so it cannot be ruled out by the current data.
3.6 The Probability Function of Underdense Regions 

Figure 11 illustrates the 'underdense probability function' P_80(R) for a variety of our 'Γ'
models. Its three rows display redshift-space UPFs for our (Γ, Ω₀, Λ₀) = (0.5, 1, 0), (0.25, 1, 0),
and (0.25, 0.3, 0.7) models, from top to bottom. The figure's columns and line types have the same
biasing associations as Figure 9's (including the association of the dot-dashed line in the top right
panel with b = 1.5, C/O biasing). Since it is always easier to find an 80% underdense region than
a totally empty one, the UPFs have higher amplitudes at a given radius than their corresponding
VPFs (cf. Figure 9). Also, these redshift-space UPFs, like redshift-space VPFs, are somewhat
larger than their real-space counterparts (cf. Figure 3).

Figure 11. The underdense probability functions (UPFs) of our 'Γ' models. P_80(R) is the probability that the
average galaxy density within a randomly placed sphere of radius R (in h⁻¹ Mpc) is more than 80% below the global
mean density. Unlike the VPF, the UPF is independent of the space density of the tracer population, except for shot
noise. In each panel, solid, dotted, short-dashed, and long-dashed lines represent UPFs for models with b = 1, 1.5,
2, and 3, respectively. The b ≥ 1.5 lines in the left and right columns are associated with peaks biasing and density
biasing, respectively. The dot-dashed line in the upper right panel represents the b = 1.5, C/O-biased model. The
model parameters Γ, Ω₀, and Λ₀ are listed in each panel.

Figure 11 confirms several of the d-independent trends of Figure 9. For example: (i) density
biasing produces significantly larger voids than peaks biasing; (ii) peaks biasing and C/O biasing
produce statistically similar voids; (iii) among biased models, the sizes of voids generally increase
as Γ is lowered; (iv) the differences in void structure between b = 1 and b = 1.5 models can be
much larger than those among b ≥ 1.5 models, and this effect is stronger among models with more
large-scale power. The UPF of unbiased models increases with increasing large-scale power, i.e., it
is higher for lower Γ. WC found a similar result for power-law models. This trend is opposite to
that of the unbiased VPF (again consistent with WC).

One advantage of the UPF over the VPF is its simplicity. It is independent of galaxy density,
except for shot noise, which becomes progressively less important at larger radii. As a result,
a single UPF can be measured directly from magnitude-limited data, without defining multiple
volume-limited samples that have multiple values of d. (However, additional complications enter if
galaxy clustering depends systematically on luminosity.) Because the UPF falls more slowly with
radius than the VPF, it can be measured out to larger scales in a given observational sample. We
do not know of any existing observational data on the UPF, so the curves in Figure 11 will have
to stand as blind predictions. UPF results from the CfA2 survey should be available in the near
future (M. Vogeley, private communication).
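
A UPF estimate for a single spherical sample can be sketched in the same spirit as the VPF
measurement. The sketch assumes a volume-limited (uniformly selected) sample with positions given
relative to its centre, takes the global mean density from the sample itself, and uses hypothetical
names (`measure_upf`, `threshold`, `seed`); a magnitude-limited application would additionally need a
selection function, which is not modelled here.

```python
import math
import random


def random_point_in_sphere(radius, rng):
    """Draw a point uniformly from a sphere centred on the origin (rejection sampling)."""
    while True:
        p = tuple(rng.uniform(-radius, radius) for _ in range(3))
        if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2:
            return p


def measure_upf(galaxies, r_sample, radius, n_test=2000, threshold=0.2, seed=0):
    """Fraction of random spheres whose galaxy density is below threshold * mean density.

    threshold = 0.2 corresponds to 'more than 80% below the global mean density'.
    The mean density is estimated from the sample itself (an assumption).
    """
    if radius >= r_sample:
        raise ValueError("test-sphere radius must be smaller than r_sample")
    rng = random.Random(seed)
    origin = (0.0, 0.0, 0.0)
    inside = [g for g in galaxies if math.dist(g, origin) <= r_sample]
    if not inside:
        raise ValueError("no galaxies inside the sample sphere")
    # Expected count in a test sphere at the mean density: N * (radius / r_sample)^3.
    expected = len(inside) * (radius / r_sample) ** 3
    underdense = 0
    for _ in range(n_test):
        centre = random_point_in_sphere(r_sample - radius, rng)
        count = sum(1 for g in inside if math.dist(centre, g) <= radius)
        if count < threshold * expected:
            underdense += 1
    return underdense / n_test
```

Because the criterion compares counts with a fixed fraction of the expected count, diluting the tracers
changes only the shot noise, which is the sense in which the UPF is independent of galaxy density.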

4. Conclusions 

We have investigated the behaviour of the void probability function (VPF) in a wide range of
initially Gaussian models for the origin of large-scale structure. We have focused on the
sensitivity of the VPF to assumptions about biased galaxy formation. Our main conclusions about the
dependence of the VPF on model parameters are as follows.

1. Peculiar velocity distortions have relatively little effect on the VPF, though VPFs measured
in redshift space tend to be slightly higher than those measured in real space.

2. The real-space VPF is extremely insensitive to the cosmic density parameter Ω₀ and the
cosmological constant Λ₀, provided that models with different values of these cosmological
parameters have the same initial power spectrum and biasing prescription and are normalized
to the same rms mass fluctuation at the present epoch.

3. In the absence of biasing, the VPF depends only weakly on the shape of the initial power
spectrum, provided that models are normalized to σ_8,mass ~ 1 today. Unbiased models with
more small-scale power have slightly higher VPFs.

4. The VPF is quite sensitive to the prescription adopted for biased galaxy formation. Density-
biased models produce much larger voids than peaks-biased models, or models that incorporate
CO's non-linear biasing prescription. Peaks-biased models have higher VPFs than unbiased
models.

5. The VPF depends more strongly on the form of the biasing prescription (e.g., density biasing
versus peaks biasing) than on the bias factor b; for a given biasing scheme, the VPF does not
change much in the range 1.5 ≤ b ≤ 3. Given the physical uncertainty in the appropriate
choice of biasing scheme, the VPF is probably not a useful tool for determining b. However,
the VPF can distinguish unbiased models from some biased models, and it can place useful
constraints on the relation between galaxies and mass.

6. Conclusions 1-5 also apply to the underdense region probability function (UPF), except that
the trend of the unbiased VPF with power spectrum (conclusion 3) is reversed for the UPF.

We have compared our model predictions to the data of VGH, who have measured VPFs of
the original CfA survey and the completed regions of the CfA extension. Our main conclusions
from comparison between models and the VGH data are as follows.

7. Unbiased Gaussian models reproduce the observed VPFs fairly well. The void probability data
from the CfA survey do not provide compelling evidence for biased galaxy formation.

8. Models that incorporate moderate levels of biasing, similar to those predicted by cosmological 
simulations with gas dynamics, produce the best overall fit to the VGH data. 

9. Density-biased models tend to overproduce large voids, at least on average. 

10. Large finite volume fluctuations are expected in samples of the size analyzed by VGH, and 
models that predict the largest voids also predict the largest sample-to-sample variations in 
the VPF. As a result, the VGH data do not convincingly rule out any of our models. 

The opening paragraph of this paper posed two broad questions. Can the gravitational growth
of Gaussian primordial fluctuations account for the observed voids, or do initially Gaussian models
require that galaxy formation be suppressed in low density regions in order to produce voids as
large and empty as observed? Do voids represent regions where there is no mass, or merely regions
where there are no (bright) galaxies? Conclusion (10) tells us that we cannot offer convincing
answers to these questions with the present observational data, at least not using the VPF as our
primary tool. The VGH data are consistent with the hypothesis that voids in the CfA survey grew
gravitationally from Gaussian primordial fluctuations, and that these voids are truly as underdense
in mass as they are in galaxies. However, the data also permit a substantial bias between galaxies
and mass, with the voids underdense, but not nearly so empty as they appear. Analysis of future
redshift surveys will help to distinguish these possibilities.